Abstract

Given an undirected graph $G = (V_G, E_G)$ and a fixed "pattern" graph $H = (V_H, E_H)$ with $k$
vertices, we consider the $H$-Transversal and $H$-Packing problems. The former asks to find the
smallest $S \subseteq V_G$ such that the subgraph induced by $V_G \setminus S$ does not have $H$ as a subgraph, and
the latter asks to find the maximum number of pairwise disjoint $k$-subsets $S_1, \dots, S_m \subseteq V_G$ such
that the subgraph induced by each $S_i$ has $H$ as a subgraph.

We prove that if $H$ is 2-connected, $H$-Transversal and $H$-Packing are almost as hard to
approximate as general $k$-Hypergraph Vertex Cover and $k$-Set Packing, so it is NP-hard to
approximate them within a factor of $\Omega(k)$ and $\tilde{\Omega}(k)$ respectively. We also show that there is a
1-connected $H$ where $H$-Transversal admits an $O(\log k)$-approximation algorithm, so that the
connectivity requirement cannot be relaxed from 2 to 1. For a special case of $H$-Transversal
where $H$ is a (family of) cycles, we mention the implication of our result to the related Feedback
Vertex Set problem, and give a different hardness proof for directed graphs.
version that appears in the proceedings of APPROX 15.
1 Introduction


Given a collection of subsets $S_1, \dots, S_m$ of the underlying set $U$, the Set Transversal problem asks
to find the smallest subset of $U$ that intersects every $S_i$, and the Set Packing problem asks to find
the largest subcollection $S_{i_1}, \dots, S_{i_{m'}}$ which are pairwise disjoint.$^1$ It is clear that the optimum of the
former is always at least that of the latter (i.e. weak duality holds). Studying the (approximate)
reverse direction of the inequality (i.e. strong duality) as well as the complexity of both problems
for many interesting classes of set systems is arguably the most studied paradigm in combinatorial
optimization.
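As a small illustration of weak duality, the following Python sketch computes both optima by brute force on a toy set system. The function names and the example instance are assumptions made for illustration and do not come from this work; the search is exponential and only intended for tiny inputs whose sets are all nonempty.

```python
from itertools import combinations

def min_transversal(universe, sets):
    """Smallest subset of `universe` that intersects every set (brute force)."""
    for size in range(len(universe) + 1):
        for cand in combinations(sorted(universe), size):
            chosen = set(cand)
            if all(chosen & s for s in sets):
                return chosen
    return set(universe)

def max_packing(sets):
    """Largest subcollection of pairwise disjoint sets (brute force)."""
    for size in range(len(sets), 0, -1):
        for cand in combinations(sets, size):
            if all(a.isdisjoint(b) for a, b in combinations(cand, 2)):
                return list(cand)
    return []

# Toy instance (hypothetical): four 3-sets over {1,...,6}, any two of which intersect.
universe = {1, 2, 3, 4, 5, 6}
sets = [frozenset(s) for s in ({1, 2, 3}, {3, 4, 5}, {5, 6, 1}, {2, 4, 6})]
transversal = min_transversal(universe, sets)
packing = max_packing(sets)
# Weak duality: every set in a packing needs its own element of the transversal.
print(len(transversal), len(packing))
```

On this instance the two optima differ, so the inequality can be strict even for 3-uniform systems.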

This work focuses on set systems where the size of each set is bounded by a constant $k$. With
this restriction, Set Transversal and Set Packing are known as $k$-Hypergraph Vertex Cover ($k$-HVC)
and $k$-Set Packing ($k$-SP), respectively. This assumption significantly simplifies the problem since
there are at most $O(|U|^k)$ sets. While there is a simple factor $k$-approximation algorithm for both
problems, it is NP-hard to approximate $k$-HVC and $k$-SP within a factor less than $k - 1$ [24] and
$\Omega(k / \ln k)$ [37] respectively.

Given a large graph $G = (V_G, E_G)$ and a fixed graph $H = (V_H, E_H)$ with $k$ vertices, one of the
natural attempts to further restrict set systems is to set $U = V_G$, and take the collection of subsets
to be all copies of $H$ in $G$ (formally defined in the next subsection). This natural representation in
graphs often results in a deeper understanding of the underlying structure and better algorithms,
with Maximum Matching ($H = K_2$) being the most well-known example. Kirkpatrick and Hell [42]
proved that Maximum Matching is essentially the only case where $H$-Packing can be solved exactly
in polynomial time: unless $H$ is the union of isolated vertices and edges, it is NP-hard to decide
whether $V_G$ can be partitioned into $k$-subsets each inducing a subgraph containing $H$. A similar
characterization for the edge version (i.e. $U = E_G$) was obtained much later by Dor and Tarsi [26].

We extend these results by studying the approximability of $H$-Transversal and $H$-Packing. We
use the term strong inapproximability to denote NP-hardness of approximation within a factor
$\Omega(k / \mathrm{polylog}(k))$. We give a simple sufficient condition that implies strong inapproximability:
if $H$ is 2-vertex connected, $H$-Transversal and $H$-Packing are almost as hard to approximate as
$k$-HVC and $k$-SP. We also show that there is a 1-connected $H$ where $H$-Transversal admits an
$O(\log k)$-approximation algorithm, so 1-connectivity is not sufficient for strong inapproximability
for $H$-Transversal. It is an interesting open problem whether 1-connectivity is enough to imply
strong inapproximability of $H$-Packing, or whether there is a class of connected graphs where $H$-Packing
admits a significantly nontrivial approximation algorithm (e.g. factor $k^{\epsilon}$ for some $\epsilon < 1$).

Our results give a unified answer to questions left open in many independent works studying
a special case where $H$ is a cycle or clique, and raise some new open questions. In the subsequent
subsections, we state our main results, review related work, and state potential future directions.

1.1 Problems and Our Results 

Given undirected graphs $G = (V_G, E_G)$ and $H = (V_H, E_H)$ with $|V_H| = k$, we define the following
problems.

• $H$-Transversal asks to find the smallest $F \subseteq V_G$ such that the subgraph of $G$ induced by
$V_G \setminus F$ does not have $H$ as a subgraph.

• $H$-Packing asks to find the maximum number of pairwise disjoint $k$-subsets $S_1, \dots, S_m$ of
$V_G$ such that the subgraph induced by each $S_i$ has $H$ as a subgraph.

$^1$ These problems are called many different names in the literature. Set Transversal is also called Hypergraph
Vertex Cover, Set Cover (of the dual set system), and Hitting Set. Set Packing is also called Hypergraph Matching.
We try to use Transversal / Packing unless another name is established in the literature (e.g. $k$-Hypergraph Vertex
Cover).

Our main result states that 2-connectivity of $H$ is sufficient to make $H$-Transversal and
$H$-Packing hard to approximate.

Theorem 1. If $H$ is a 2-vertex connected graph with $k$ vertices, unless $\mathsf{NP} \subseteq \mathsf{BPP}$, no polynomial time
algorithm approximates $H$-Transversal within a factor better than $k - 1$, and $H$-Packing within a
factor better than $\Omega(k / \mathrm{polylog}(k))$.

Let $k$-Star denote $K_{1,k-1}$, the complete bipartite graph with 1 and $k - 1$ vertices on each side.
The following theorem shows that $k$-Star Transversal admits a good approximation algorithm,
so the assumption of 2-connectedness in Theorem 1 is required for strong inapproximability of
$H$-Transversal.

Theorem 2. k-Star Transversal can be approximated within a factor of O(log k) in polynomial 
time. 

This algorithmic result matches the $\Omega(\log k)$-hardness of $k$-Star Transversal via a simple reduction
from Minimum Dominating Set on degree-$k$ graphs [?]. This problem has the following equivalent
but more natural interpretation: given a graph $G = (V_G, E_G)$, find the smallest $F \subseteq V_G$ such that
the subgraph induced by $V_G \setminus F$ has maximum degree at most $k - 2$. Our algorithm, which uses
iterative roundings of 2-rounds of Sherali-Adams hierarchy of linear programming (LP) followed
by a simple greedy algorithm for Constrained Set Cover, is also interesting in its own right, but we
defer the details to Appendix A.
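The equivalence between "no copy of $K_{1,k-1}$" and "maximum degree at most $k - 2$" can be checked exhaustively on a toy graph. The sketch below is only an illustration: the helper names and the example graph are assumptions, and the brute-force search has nothing to do with the approximation algorithm of Appendix A.

```python
from itertools import combinations

def has_k_star(adj, alive, k):
    """Search directly for a center with k-1 surviving neighbours (a copy of K_{1,k-1})."""
    for center in alive:
        for leaves in combinations(alive - {center}, k - 1):
            if all(u in adj[center] for u in leaves):
                return True
    return False

def max_degree(adj, alive):
    """Maximum degree of the subgraph induced by `alive`."""
    return max((len(adj[v] & alive) for v in alive), default=0)

# Hypothetical example: a star centered at 0 with a short path attached to leaf 3.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

k = 4
vertices = set(adj)
subsets = [set(F) for r in range(len(vertices) + 1)
           for F in combinations(sorted(vertices), r)]

# For every deletion set F the two characterizations agree.
agree = all(has_k_star(adj, vertices - F, k) == (max_degree(adj, vertices - F) >= k - 1)
            for F in subsets)

# Smallest F leaving maximum degree at most k - 2, i.e. an optimal k-Star Transversal.
smallest = min((F for F in subsets if max_degree(adj, vertices - F) <= k - 2), key=len)
print(agree, smallest)
```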

Our hardness results for transversal problems rely on hardness of $k$-HVC, which is NP-hard to
approximate within a factor better than $k - 1$ [24]. Our hardness results for packing problems rely
on hardness of Maximum Independent Set on graphs with maximum degree $k$ and girth strictly
greater than $g$ (MIS-$k$-$g$). Almost tight inapproximability of MIS on graphs with maximum degree
$k$ (MIS-$k$) was recently proved by Chan [12], which rules out an approximation algorithm with ratio
better than $\Omega(k / \log^4 k)$. We are able to extend his result to MIS-$k$-$g$, losing only a polylogarithmic
factor. All applications in this work require $g = O(k)$.

Theorem 3. For any constants $k$ and $g$, unless $\mathsf{NP} \subseteq \mathsf{BPP}$, no polynomial time algorithm
approximates MIS-$k$-$g$ within a factor $\Omega(k / \mathrm{polylog}(k))$.

We remark that assuming the Unique Games Conjecture (UGC) slightly improves our hardness
ratios through better hardness of $k$-HVC [?] and MIS-$k$ [1], and even simplifies the proof for
some problems (e.g. $k$-Clique Transversal) through structured hardness of $k$-HVC [6]. Indeed, an
earlier (unpublished) version of this work [33] relied on the UGC to prove that MIS-$k$-$k$ is hard to
approximate within a factor of $\Omega(k / \log^4 k)$, while only giving $\Omega(\sqrt{k})$-factor hardness without it. Now
that we obtain almost matching hardness, we focus on proving hardness results without the UGC.

1.2 Related Work and Special Cases 

After the aforementioned work characterizing those pattern graphs $H$ admitting
a polynomial-time exact algorithm for $H$-Packing [42, 26], Lund and Yannakakis [48] studied the
maximization version of $H$-Transversal (i.e. find the largest $V' \subseteq V_G$ such that the subgraph
induced by $V'$ does not have $H$ as a subgraph), and showed it is hard to approximate within factor
$2^{\log^{1/2 - \epsilon} n}$ for any $\epsilon > 0$. They also mentioned the minimization version of two extensions of
$H$-Transversal. The most general node-deletion problem is APX-hard for every nontrivial hereditary
(i.e. closed under node deletion) property, and the special case where the property is characterized
by a finite number of forbidden subgraphs (i.e. $\{H_1, \dots, H_l\}$-Transversal in our terminology) can be
approximated with a constant ratio. They did not provide explicit constants (one trivial
approximation ratio for $\{H_1, \dots, H_l\}$-Transversal is $\max(|V_{H_1}|, \dots, |V_{H_l}|)$), and our result can be viewed as
a quantitative extension of their inapproximability results for the special case of $H$-Transversal.

$H$-Transversal / Packing has also been studied outside the approximation algorithms
community. The duality between our $H$-Transversal and $H$-Packing is closely related to the famous
Erdős-Pósa property actively studied in combinatorics. The recent work of Jansen and Marx [39]
considered problems similar to our $H$-Packing with respect to fixed-parameter tractability (FPT).

Many other works on $H$-Transversal / Packing focus on a special case where $H$ is a cycle or
clique. We define $k$-Cycle (resp. $k$-Clique) to be the cycle (resp. clique) on $k$ vertices.

1.2.1 Cycles 

The initial motivation for our work was to prove a super-constant factor inapproximability for the 
Feedback Vertex Set (FVS) problem without relying on the Unique Games Conjecture. Given 
a (directed) graph G, the FVS problem asks to find a subset F of vertices with the minimum 
cardinality that intersects every cycle in the graph (equivalently, the induced subgraph G\F is 
acyclic). One of Karp’s 21 NP-complete problems, FVS has been a subject of active research for 
many years in terms of approximation algorithms and fixed-parameter tractability (FPT). For FPT 
results, see [?] and references therein.

FVS on undirected graphs has a 2-approximation algorithm [?], but the same problem is
not well-understood in directed graphs. The best approximation algorithm [?] achieves an
approximation factor of $O(\log n \log\log n)$. The best hardness result follows from a simple
approximation preserving reduction from Vertex Cover, which implies that it is NP-hard to approximate
FVS within a factor of 1.36 [25]. Assuming UGC [40], it is NP-hard to approximate FVS in directed
graphs within any constant factor [34, 53] (we give a simpler proof in [33]). The main challenge is
to bypass the UGC and to show a super-constant inapproximability result for FVS assuming only
$\mathsf{P} \neq \mathsf{NP}$ or $\mathsf{NP} \not\subseteq \mathsf{BPP}$.

By Theorem 1, we prove that $k$-Cycle Transversal is hard to approximate within factor $\Omega(k)$.
The following theorem improves the result of Theorem 1 in the sense that in the completeness case,
a small number of vertices not only intersect cycles of length exactly $k$, but intersect every cycle of
length $3, 4, \dots, O(\frac{\log n}{\log\log n})$.

Theorem 4. Fix an integer $k \geq 3$ and $\epsilon \in (0, 1)$. Given a graph $G = (V_G, E_G)$ (directed or
undirected), unless $\mathsf{NP} \subseteq \mathsf{BPP}$, there is no polynomial time algorithm to tell apart the following two
cases.

• Completeness: There exists $F \subseteq V_G$ with $\frac{1}{k-1} + \epsilon$ fraction of vertices that intersects every
cycle of length at most $O(\frac{\log n}{\log\log n})$ (hidden constant in $O$ depends on $k$ and $\epsilon$).

• Soundness: Every subset $F$ with less than $1 - \epsilon$ fraction of vertices does not intersect at least
one cycle of length $k$. Equivalently, any subset with more than $\epsilon$ fraction of vertices has a
cycle of length exactly $k$ in the induced subgraph.

This can be viewed as some (modest) progress towards showing inapproximability of FVS in
the following sense. Consider the following standard linear programming (LP) relaxation for FVS.

\[
\min \sum_{v \in V_G} x_v \quad \text{subject to} \quad \sum_{v \in C} x_v \geq 1 \;\; \forall \text{ cycle } C, \quad \text{and} \quad 0 \leq x_v \leq 1 \;\; \forall v \in V_G.
\]


The integrality gap of the above LP is upper bounded by $O(\log n)$ for undirected graphs [7] and
$O(\log n \log\log n)$ for directed graphs [30]. Suppose in the completeness case, there exists a set of
measure $c$ that intersects every cycle of length at most $\log^{1.1} n$ (or any number bigger than the
known integrality gaps). If we remove these vertices and consider the above LP on the remaining
subgraph, since every cycle is of length at least $\log^{1.1} n$, setting $x_v = 1 / \log^{1.1} n$ is a feasible
solution, implying that the optimal solution to the LP is at most $n / \log^{1.1} n$. Since the integrality
gap is at most $O(\log n \log\log n)$, we can conclude that the remaining cycles can be hit by at most
$O(n \log\log n / \log^{0.1} n) = o(n)$ vertices, extending the completeness result to every cycle. Thus,
improving our result to hit cycles of length $\omega(\log n \log\log n)$ in the completeness case will prove a
factor-$\omega(1)$ inapproximability of FVS.
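The feasibility step in this argument (if every cycle has at least $g$ vertices, then $x_v = 1/g$ satisfies every cycle constraint, so the LP value is at most $n/g$) can be checked on a small example. The sketch below uses the Petersen graph, which is our own choice of test instance; the cycle enumeration is a naive search written for illustration only and assumes integer vertex labels.

```python
from fractions import Fraction

def simple_cycle_vertex_sets(adj):
    """Vertex sets of all simple cycles of a small undirected graph (naive DFS)."""
    found = set()

    def extend(start, path, on_path):
        for nxt in adj[path[-1]]:
            if nxt == start and len(path) >= 3:
                found.add(frozenset(path))
            elif nxt > start and nxt not in on_path:
                # Only grow through larger vertices so `start` is the smallest on the cycle.
                extend(start, path + [nxt], on_path | {nxt})

    for start in adj:
        extend(start, [start], {start})
    return found

# Petersen graph: outer 5-cycle, spokes, inner pentagram.
edges = ([(i, (i + 1) % 5) for i in range(5)]
         + [(i, i + 5) for i in range(5)]
         + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
adj = {v: set() for v in range(10)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

cycles = simple_cycle_vertex_sets(adj)
girth = min(len(c) for c in cycles)
x = Fraction(1, girth)                           # uniform fractional solution
feasible = all(len(c) * x >= 1 for c in cycles)  # every cycle constraint holds
lp_value = len(adj) * x                          # objective value n / girth
print(girth, feasible, lp_value)
```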


Another interesting aspect about Theorem 4 is that it also holds for undirected graphs. This
should be contrasted with the fact that undirected graphs admit a 2-approximation algorithm for
FVS, suggesting that to overcome the $\log n$-cycle barrier mentioned above, some properties of directed
graphs must be exploited. Towards developing a directed graph specific approach, we also present
a different reduction technique called labeling gadget in Appendix B.3 to prove a similar result
only on directed graphs. It has an additional advantage of being derandomized and assumes only
$\mathsf{P} \neq \mathsf{NP}$.


For cycles of bounded length, Kortsarz et al. [44] studied $k$-Cycle Edge Transversal, and
suggested a $(k - 1)$-approximation algorithm as well as proved that improving the ratio 2 for $k = 3$ will
have the same impact on Vertex Cover, refuting the Unique Games Conjecture [?].


For the dual problem of packing cycles of any length, called Vertex-Disjoint Cycle Packing
(VDCP), the results of [45, 32] imply that the best approximation factor by any polynomial time
algorithm lies between $\Omega(\sqrt{\log n})$ and $O(\log n)$. In a closely related problem Edge-Disjoint Cycle
Packing (EDCP), the same papers showed that $\Theta(\log n)$ is the best possible. In directed graphs
the vertex and edge versions have the same approximability; the best known algorithm achieves an
$O(\sqrt{n})$-approximation while the best hardness result remains $\Omega(\log n)$.


Variants of $k$-Cycle Packing have also been considered in the literature. Rautenbach and
Regen [50] studied $k$-Cycle Edge Packing on graphs with girth $k$ and small degree. Chalermsook et
al. [?] studied a variant of $k$-Cycle Packing on directed graphs for $k \geq n^{1/2}$ where we want to
pack as many disjoint cycles of length at most $k$ as possible, and proved that it is NP-hard to
approximate within a factor of $n^{1/2 - \epsilon}$. This matches the algorithm implied by [45].


1.2.2 Cliques 

Minimum Maximal (resp. Maximum) Clique Transversal asks to find the smallest subset of vertices
that intersects every maximal (resp. maximum) clique in the graph. In mathematics, Tuza [54]
and Erdős et al. [28] started to estimate the size of the smallest such set depending on the structure of
graphs. See the recent work of Shan et al. [52] and references therein. In computer science, exactly
computing the smallest set on special classes of graphs appears in many works [?].

Both the edge and vertex versions of $k$-Clique Packing have also been studied actively both in
mathematics and computer science. In mathematics, the main focus of research is lower bounding
the maximum number of edge or vertex-disjoint copies of $K_k$ in very dense graphs (note that even
$K_3$ does not exist in $K_{n,n}$ which has $2n$ vertices and $n^2$ edges). See the recent paper [56] or the
survey [55] of Yuster. The latter survey also mentions approximation algorithms, including
APX-hardness and the general approximation algorithm for $k$-Set Packing [22], which applies to both
the vertex version and the edge version. Feder and Subi [?] considered $H$-Edge
Packing and showed APX-hardness when $H$ is a $k$-cycle or $k$-clique. Chataigner et al. [14] considered
an interesting variant where we want to pack vertex-disjoint cliques of any size to maximize the
total number of edges of the packed cliques, and proved APX-hardness and a 2-approximation
algorithm. Exact algorithms for special classes of graphs have been considered in [10, 36, 38, 43].

1.3 Open Problems 

For $H$-Transversal, 1-connectivity is not sufficient for strong hardness, because $k$-Star Transversal
admits an $O(\log k)$-approximation algorithm by Theorem 2. It is open whether 1-connectivity
is sufficient or not for such strong hardness for $H$-Packing. $k$-Star Packing is at least as hard
as MIS-$k$ by a trivial reduction, but the approximability of $k$-Path Packing appears to be still
unknown. Whether $k$-Path Transversal admits a factor $o(k)$ approximation algorithm is also an
intriguing question. For directed acyclic graphs, Svensson [53] proved that it is Unique Games-hard
to approximate $k$-Path Transversal within a factor better than $k$.

The approximability of $H$-Edge Transversal and $H$-Edge Packing is less understood than the
vertex versions. Proving tight characterizations for the edge versions similar to Theorem 1 is an
interesting open problem.


1.4 Organization 


The rest of the main body is devoted to proving Theorem 1 for $H$-Transversal / Packing and
Theorem 3 for MIS-$k$-$g$. Section 2 recalls and extends previous hardness results for the problems
we reduce from; Sections 3 and 4 prove hardness of $H$-Transversal and $H$-Packing respectively.
Appendix A gives an $O(\log k)$-approximation algorithm for $k$-Star Transversal, proving Theorem 2.
Appendix B proves Theorem 4 to illustrate the connection to FVS.


2 Preliminary 

Notation. A $k$-uniform hypergraph is denoted by $P = (V_P, E_P)$ such that each $e \in E_P$ is a
$k$-subset of $V_P$. We denote $e$ as an ordered $k$-tuple $e = (v^1, \dots, v^k)$. The ordering can be chosen
arbitrarily given $P$, but should be fixed throughout. If $v$ indicates a vertex of some graph, we use a
superscript $v^i$ to denote another vertex of the same graph, and $e^i$ to denote the $i$th (hyper)edge. For
an integer $m$, let $[m] = \{1, 2, \dots, m\}$. Unless otherwise stated, the measure of $F \subseteq V$ is obtained
under the uniform measure on $V$, which is simply $\frac{|F|}{|V|}$.

$k$-HVC. An instance of $k$-HVC consists of a $k$-uniform hypergraph $P$, where the goal is to find a
set $C \subseteq V_P$ with the minimum cardinality such that it intersects every hyperedge. The result of
Dinur, Guruswami, Khot and Regev [24] states that

Theorem 5 ([24]). Given a $k$-uniform hypergraph ($k \geq 3$) and $\epsilon > 0$, it is NP-hard to tell apart
the following cases:

• Completeness: There exists a vertex cover of measure $\frac{1}{k-1} + \epsilon$.

• Soundness: Every vertex cover has measure at least $1 - \epsilon$.

Therefore, it is NP-hard to approximate $k$-HVC within a factor $k - 1 - \delta$ for any constant $\delta > 0$.


Moreover, the above result holds even when the degree of a hypergraph is bounded by $d$
depending on $k$ and $\epsilon$. See Appendix B.2 for details.


MIS-$k$. Given a graph $G = (V_G, E_G)$, a subset $S \subseteq V_G$ is independent if the subgraph induced by $S$
does not contain any edge. The Maximum Independent Set (MIS) problem asks to find the largest
independent set, and MIS-$k$ indicates the same problem where $G$ is promised to have maximum
degree at most $k$. The recent result of Chan [12] implies


Theorem 6 ([12]). Given a graph $G$ with maximum degree at most $k$, it is NP-hard to tell apart
the following cases:

• Completeness: There exists an independent set of measure $\Omega(1 / \log k)$.

• Soundness: Every subset of vertices of measure $O(\log^3 k / k)$ contains an edge.

Therefore, it is NP-hard to approximate MIS-$k$ within a factor $\Omega(k / \log^4 k)$.


3 $H$-Transversal

In this section, given a 2-connected graph $H = (V_H, E_H)$ with $k$ vertices, we give a reduction
from $k$-HVC to $H$-Transversal. The simplest try will be, given a hypergraph $P = (V_P, E_P)$ (let
$n = |V_P|$, $m = |E_P|$), to produce a graph $G = (V_G, E_G)$ where $V_G = V_P$, and for each hyperedge
$e = (v^1, \dots, v^k)$ add $|E_H|$ edges that form a canonical copy of $H$ to $E_G$. While the soundness
follows directly (if $F \subseteq V_P$ contains a hyperedge, the subgraph induced by $F$ contains $H$), the
completeness property does not hold since edges that belong to different canonical copies may form
an unintended non-canonical copy. To prevent this, a natural strategy is to replace each vertex by
a set of many vertices (call it a cloud), and for each hyperedge $(v^1, \dots, v^k)$, add many canonical
copies on the $k$ clouds (each copy consists of one vertex from each cloud). If we have too many
canonical copies, soundness works easily but completeness is hard to show due to the risk posed by
non-canonical copies, and in the other extreme, having too few canonical copies could result in the
violation of the soundness property. Therefore, it is important to control the structure (number)
of canonical copies to ensure both completeness and soundness at the same time.

Our technique, which we call random matching, proceeds by creating a carefully chosen number
of random copies of $H$ for each hyperedge to ensure both completeness and soundness. We remark
that properties of random matchings are also used to bound the number of short non-canonical
paths in inapproximability results for edge-disjoint paths on undirected graphs [?]. The details
in our case are different as we create many copies of $H$ based on a hypergraph.

Fix $\epsilon > 0$, apply Theorem 5, and let $c$ and $s := 1 - \epsilon$ be the measure of the minimum vertex
cover in the completeness and soundness case respectively, and $d := d(k, \epsilon)$ be the maximum degree
of hard instances. Let $a$ and $B$ be integer constants greater than 1, which will be determined later.
Lemmas 1 and 3 with these parameters imply the first half of Theorem 1.

Reduction. Without loss of generality, assume that $V_H = [k]$. Given a hypergraph $P = (V_P, E_P)$,
construct an undirected graph $G = (V_G, E_G)$ such that

• $V_G = V_P \times [B]$. Let $n = |V_P|$ and $N = |V_G| = nB$. For $v \in V_P$, let $\mathrm{cloud}(v) := \{v\} \times [B]$ be
the copy of $[B]$ associated with $v$.

• For each hyperedge $e = (v^1, \dots, v^k)$, for $aB$ times, take $l^1, \dots, l^k$ independently and uniformly
from $[B]$. For each edge $(i, j) \in E_H$ ($1 \leq i < j \leq k$), add $((v^i, l^i), (v^j, l^j))$ to $E_G$. Each time we
add $|E_H|$ edges isomorphic to $H$, and we have $aB$ such copies of $H$ per hyperedge.
Call such copies canonical.
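The construction can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name, the 0-indexed labels and pattern vertices, and the seeded random generator are ours, and the small parameters in the example are for illustration rather than the values required by the analysis.

```python
import random

def random_matching_reduction(hyperedges, pattern_edges, k, a, B, seed=0):
    """Build E_G on V_P x {0,...,B-1} from a*B random canonical copies per hyperedge.

    hyperedges    -- ordered k-tuples of hypergraph vertices
    pattern_edges -- edges (i, j) of H with 0 <= i < j < k (0-indexed, an assumption)
    """
    rng = random.Random(seed)
    edges = set()
    canonical = []
    for e in hyperedges:
        for _ in range(a * B):
            # One independent uniform label per cloud of the hyperedge.
            labels = [rng.randrange(B) for _ in range(k)]
            copy = [(e[i], labels[i]) for i in range(k)]
            canonical.append(copy)
            for i, j in pattern_edges:
                edges.add(frozenset((copy[i], copy[j])))
    return edges, canonical

# Hypothetical toy instance: H is a triangle, P has two hyperedges.
triangle = [(0, 1), (1, 2), (0, 2)]
hyperedges = [(0, 1, 2), (1, 2, 3)]
a, B = 2, 4
edges, canonical = random_matching_reduction(hyperedges, triangle, 3, a, B)
print(len(canonical), len(edges))
```

Duplicate edges produced by different canonical copies collapse in the edge set, so the number of edges is at most $|E_H|$ times the number of canonical copies.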

Completeness. The next lemma shows that if $P$ has a small vertex cover, $G$ also has a small
$H$-Transversal.

Lemma 1. Suppose $P$ has a vertex cover $C$ of measure $c$. For any $\epsilon > 0$, with probability at least
3/4, there exists a subset $F \subseteq V_G$ of measure at most $c + \epsilon$ such that the subgraph induced by $V_G \setminus F$
has no copy of $H$.

Proof. Let $F = C \times [B]$. We consider the expected number of copies of $H$ that avoid $F$ and
argue that a small fraction of additional vertices intersect all of these copies. Choose $k$ vertices
$(v^1, l^1), \dots, (v^k, l^k)$ which satisfy

• $v^1 \in V_P$ can be any vertex.

• $l^1, \dots, l^k \in [B]$ can be arbitrary labels.

• For each $(i, j) \in E_H$, there must be a hyperedge of $P$ containing both $v^i$ and $v^j$.

There are $n$ possible choices for $v^1$, $B$ choices for each $l^i$, and at most $kd$ choices for each $v^i$ ($i > 1$).
The number of possibilities to choose such $((v^1, l^1), \dots, (v^k, l^k))$ is bounded by $n (dk)^k B^k$. Note that
no other $k$-tuple of vertices induces a connected graph and contains a copy of $H$. Further discard
the tuple when two vertices are the same.

We calculate the probability that the subgraph induced by $((v^1, l^1), \dots, (v^k, l^k))$ contains a copy
in this order; formally, for all $(i, j) \in E_H$, $((v^i, l^i), (v^j, l^j)) \in E_G$. For each $(i, j) \in E_H$, we call
the pair $((v^i, l^i), (v^j, l^j))$ a purported edge. For a set of purported edges, we say that this
set can be covered by a single canonical copy if one canonical copy of $H$ can contain all
purported edges with nonzero probability. Suppose that all $|E_H|$ purported edges can be covered
by a single canonical copy of $H$. It is only possible when there is a hyperedge whose $k$ vertices are
exactly $\{v^1, \dots, v^k\}$. In this case, $((v^1, l^1), \dots, (v^k, l^k))$ intersects $F$ (right case of Figure 1). When
$|E_H|$ purported edges have to be covered by more than one canonical copy, some vertices must be
covered by more than one canonical copy, and each canonical copy covering the same vertex should
give the same label to that vertex. This redundancy makes it unlikely to have all purported edges exist at
the same time (left case of Figure 1). The claim below formalizes this intuition.

Claim 1. Suppose that $((v^1, l^1), \dots, (v^k, l^k))$ cannot be covered by a single canonical copy. Then
the probability that it forms a copy of $H$ is at most $\frac{(adk)^{k^2}}{B^k}$.

Proof. Fix $2 \leq p \leq |E_H|$. Partition the $|E_H|$ purported edges into $p$ nonempty groups $I_1, \dots, I_p$
such that each group can be covered by a single canonical copy of $H$. There are at most $p^{|E_H|}$
possibilities to partition. For each $v \in V_P$, there are at most $d$ hyperedges containing $v$ and at
most $aBd$ canonical copies intersecting $\mathrm{cloud}(v)$. Therefore, all edges in one group can be covered
simultaneously by at most $aBd$ canonical copies. There are at most $(aBd)^p$ possibilities
to assign a canonical copy to each group. Assume that one canonical copy is responsible for exactly
one group. This is without loss of generality since if one canonical copy is responsible for many
groups, we can merge them and this case can be dealt with using a smaller $p$.
Figure 1: Two examples where $k = 4$ and $H$ is a 4-cycle. On the left, purported edges are divided into two
groups (dashed and solid edges). Each canonical copy of the cycle should match the labels of three vertices to
ensure it covers 2 designated edges (6 labels total). On the right, one canonical copy can cover all the edges,
and it only needs to match the labels of four vertices (4 labels total).

Focus on one group $I$ of purported edges, and one canonical copy $L = (V_L, E_L)$ which is
supposed to cover them. Let $I' \subseteq V_G$ be the set of vertices which are incident on the edges in $I$.
Suppose $V_L = \{(u^1, l'^1), \dots, (u^k, l'^k)\}$, which is created by a hyperedge $f = (u^1, \dots, u^k) \in E_P$. We
calculate the probability that $L$ contains all edges in $I$ over the choice of labels for $L$.

One necessary condition is that $\{v : (v, l) \in I' \text{ for some } l \in [B]\}$ (i.e. the set $I'$ projected to $V_P$) is
contained in $f$. Otherwise, some vertices of $I'$ cannot be covered by $L$. Another necessary condition
is $v^i \neq v^j$ for any $(v^i, l^i) \neq (v^j, l^j) \in I'$. Otherwise (i.e. $(v, l^i), (v, l^j) \in I'$ for $l^i \neq l^j$), since $L$ gives
only one label to each vertex in $f \subseteq V_P$, $(v, l^i)$ and $(v, l^j)$ cannot be contained in $L$ simultaneously.
Therefore, we have a nice characterization of $I'$: it consists of at most one vertex from the cloud
of each vertex in $f$.

The probability that $L$ contains $I$ is at most the probability that for each $(v^i, l^i) \in I'$, $l^i$ is equal
to the label $L$ assigns to $v^i$, which is $B^{-|I'|}$. Now we need the following lemma saying that the sum
of $|I'|$ is large, which relies on 2-connectivity of $H$.

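Lemma 2. Fix $p \geq 2$. For any partition $I_1, \dots, I_p$ of purported edges into $p$ non-empty groups,

\[ \sum_{i=1}^{p} |I'_i| \geq k + p. \]

Proof. Let $t$ be the number of vertices contained in at least two $I'_i$s. Call them boundary vertices.
Note that exactly $k - t$ vertices belong to exactly one $I'_i$. For $i = 1, \dots, p$, let $b_i$ be the number
of boundary vertices in $I'_i$. Since $I_i$ is a proper subgraph of $H$ and $H$ is 2-vertex connected,
$b_i \geq 2$ for each $i$. Moreover, every boundary vertex is counted in at least two groups, so
$\sum_i b_i \geq \max(2p, 2t)$. Therefore,

\[ \sum_{i=1}^{p} |I'_i| = (k - t) + \sum_{i=1}^{p} b_i \geq (k - t) + \max(2p, 2t) \geq k + p. \]

□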
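Lemma 2 can be checked exhaustively for small patterns. The following Python sketch enumerates every way of splitting the edges of $H$ into at least two non-empty groups and reports the smallest value of $\sum_i |I'_i| - (k + p)$; the function name and the test patterns are assumptions chosen for illustration. It ignores the requirement that each group be coverable by one canonical copy, which only enlarges the set of partitions tested.

```python
from itertools import product

def min_slack(k, edges):
    """Minimum of sum_i |I'_i| - (k + p) over all partitions of `edges` into p >= 2 groups."""
    m = len(edges)
    best = None
    # Every labeling of edges by group ids; partitions repeat, which is harmless here.
    for assign in product(range(m), repeat=m):
        groups = {}
        for edge, g in zip(edges, assign):
            groups.setdefault(g, set()).update(edge)  # vertex set I'_g of group g
        p = len(groups)
        if p < 2:
            continue
        slack = sum(len(vs) for vs in groups.values()) - (k + p)
        best = slack if best is None else min(best, slack)
    return best

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]                     # 2-connected
clique4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]    # 2-connected
star4 = [(0, 1), (0, 2), (0, 3)]                              # only 1-connected

print(min_slack(4, cycle4), min_slack(4, clique4), min_slack(4, star4))
```

For the two 2-connected patterns the slack never goes below zero, while the 4-vertex star violates the inequality, which is consistent with 2-connectivity being used in the proof.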












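We conclude that for each partition, the probability of having all the edges is at most

\[ (aBd)^p \prod_{q=1}^{p} B^{-|I'_q|} \leq \frac{(aBd)^p}{B^{k+p}} = \frac{(ad)^p}{B^k}. \]

The probability that $((v^1, l^1), \dots, (v^k, l^k))$ forms a copy is therefore bounded by

\[ \sum_{p=2}^{|E_H|} p^{|E_H|} \frac{(ad)^p}{B^k} \leq \frac{(adk)^{k^2}}{B^k}. \]

□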

Therefore, the expected number of copies that avoid $F$ is bounded by $n (kd)^k B^k \cdot \frac{(adk)^{k^2}}{B^k} \leq n (adk)^{2k^2}$. With
probability at least 3/4, the number of such copies is at most $4n (adk)^{2k^2}$. Let $B \geq \frac{4 (adk)^{2k^2}}{\epsilon}$. Then
these copies of $H$ can be covered by at most $\epsilon n B = \epsilon N$ vertices. □

Soundness. The soundness claim is easier to establish. By an averaging argument, a subset
$I$ of $V_G$ of measure $2\epsilon$ must contain $\epsilon B$ vertices from each of the clouds corresponding to a subset $S$ of
measure $\epsilon$ in $V_P$. There must be a hyperedge $e$ contained within $S$, and the chosen parameters
ensure that one of the canonical copies corresponding to $e$ is likely to lie within $I$.

Lemma 3. For $a = a(k, \epsilon)$ and $B = \Omega(\log |E_P|)$, if every subset of $V_P$ of measure at least $\epsilon$ contains
a hyperedge in the induced subgraph, with probability at least 3/4, every subset of $V_G$ with measure
$2\epsilon$ contains a canonical copy of $H$.

Proof. We want to show that the following property holds for every hyperedge $e = (v^1, \dots, v^k)$: if a
subset of vertices $I \subseteq V_G$ has at least $\epsilon$ fraction of vertices from each $\mathrm{cloud}(v^i)$, then $I$ will contain
a canonical copy. Fix $A^1 \subseteq \mathrm{cloud}(v^1), \dots, A^k \subseteq \mathrm{cloud}(v^k)$ such that for each $i$, $|A^i| \geq \epsilon B$. There
are at most $2^{kB}$ ways to choose such $A^i$'s. The probability that one canonical copy associated with $e$
is not contained in $A^1 \times \dots \times A^k$ is at most $1 - \epsilon^k$. The probability that none of the canonical
copies associated with $e$ is contained in $A^1 \times \dots \times A^k$ is at most $(1 - \epsilon^k)^{aB} \leq \exp(-aB\epsilon^k)$.

By a union bound over all $A^1, \dots, A^k$, the probability that there exist $A^1, \dots, A^k$ containing no
canonical copy is at most $\exp(kB - aB\epsilon^k) = \exp(-B) \leq \frac{1}{4|E_P|}$ by taking $a$ to be a large enough constant
depending on $k$ and $\epsilon$, and $B = \Omega(\log |E_P|)$. Therefore, with probability at least 3/4, the desired
property holds for all hyperedges.

Let $I$ be a subset of $V_G$ of measure at least $2\epsilon$. By an averaging argument, at least $\epsilon$ fraction of
vertices $v \in V_P$ are good, meaning that $|\mathrm{cloud}(v) \cap I| \geq \epsilon B$. By the soundness property of $P$, there is a
hyperedge $e$ contained in the subgraph induced by the good vertices, and the above property for $e$
ensures that $I$ contains a canonical copy. □
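To make the parameter choice in the union bound concrete, the sketch below picks $a = \lceil (k+1)/\epsilon^k \rceil$ and $B = \lceil \ln(4|E_P|) \rceil$ and evaluates the logarithm of $2^{kB} (1 - \epsilon^k)^{aB}$, the per-hyperedge failure bound. These specific formulas for $a$ and $B$ are our own assumption: they are one choice consistent with "a large enough constant" and $B = \Omega(\log |E_P|)$, and the completeness lemma imposes a further lower bound on $B$ that is not modelled here.

```python
import math

def soundness_parameters(k, eps, num_hyperedges):
    """One admissible choice of (a, B) for the soundness union bound (an assumption)."""
    a = math.ceil((k + 1) / eps ** k)                  # makes kB - a*B*eps^k <= -B
    B = max(1, math.ceil(math.log(4 * num_hyperedges)))  # makes exp(-B) <= 1/(4|E_P|)
    return a, B

def log_failure_bound(k, eps, a, B):
    """Natural log of 2^{kB} * (1 - eps^k)^{aB}, computed in log space to avoid overflow."""
    return k * B * math.log(2) + a * B * math.log1p(-eps ** k)

k, eps, m = 3, 0.5, 100
a, B = soundness_parameters(k, eps, m)
bound = log_failure_bound(k, eps, a, B)
# The per-hyperedge failure probability is at most 1/(4m), so a union bound over
# all m hyperedges leaves total failure probability at most 1/4.
print(a, B, bound, -math.log(4 * m))
```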


4 $H$-Packing and MIS-$k$-$g$

Given a 2-connected graph $H$, the reduction from MIS-$k$-$k$ to $H$-Packing is relatively
straightforward. Here we assume that hard instances of MIS-$k$-$k$ are indeed $k$-regular for simplicity. Given
an instance $M = (V_M, E_M)$ of MIS-$k$-$k$, we take $G = (V_G, E_G)$ to be its line graph: $V_G = E_M$,
and $e, f \in V_G$ are adjacent if and only if they share an endpoint as edges of $M$.
For each vertex $v \in V_M$, let $\mathrm{star}(v) := \{e \in V_G : v \in e\}$. $\mathrm{star}(v)$ induces a $k$-clique, and
for $v, u \in V_M$, $\mathrm{star}(u)$ and $\mathrm{star}(v)$ share one vertex if $u$ and $v$ are adjacent, and share no vertex
otherwise. Given an independent set $S$ of $M$, we can find $|S|$ pairwise disjoint stars in $G$, which
gives $|S|$ vertex-disjoint copies of $H$. On the other hand, 2-connectivity of $H$ and large girth of
$M$ imply that any copy of $H$ must be entirely contained in one star, proving that many disjoint
copies of $H$ in $G$ also give a large independent set of $M$ with the same cardinality, completing the
reduction from MIS-$k$-$k$ to $H$-Packing. The following lemma formalizes the above intuition.
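The line-graph construction and the behaviour of stars can be sketched as follows. The helper name and the use of the Petersen graph (3-regular with girth 5, so a valid MIS-3-3 style instance) are our own choices for illustration.

```python
from itertools import combinations

def line_graph_with_stars(edges_of_M):
    """Return the line graph adjacency of M and star(v) for every vertex v of M."""
    vertices = [frozenset(e) for e in edges_of_M]      # V_G = E_M
    adjacency = {e: {f for f in vertices if f != e and e & f} for e in vertices}
    stars = {}
    for e in vertices:
        for v in e:
            stars.setdefault(v, set()).add(e)           # star(v) = edges of M containing v
    return adjacency, stars

# Petersen graph: outer 5-cycle, spokes, inner pentagram.
petersen = ([(i, (i + 1) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
edge_set = {frozenset(e) for e in petersen}
adjacency, stars = line_graph_with_stars(petersen)

# star(u) and star(v) share exactly one vertex of G iff u and v are adjacent in M.
shared_ok = all(len(stars[u] & stars[v]) == (1 if frozenset((u, v)) in edge_set else 0)
                for u, v in combinations(stars, 2))

# Each star induces a clique in the line graph.
cliques_ok = all(f in adjacency[e] for s in stars.values() for e, f in combinations(s, 2))

# An independent set of M yields pairwise disjoint stars, hence disjoint k-cliques in G.
independent = [0, 2, 8, 9]
disjoint_ok = all(stars[u].isdisjoint(stars[v]) for u, v in combinations(independent, 2))
print(shared_ok, cliques_ok, disjoint_ok)
```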

Lemma 4. For a 2-connected graph $H$ with $k$ vertices, there is an approximation-preserving
reduction from MIS-$k$-$k$ to $H$-Packing.

Proof. Let $M = (V_M, E_M)$ be an instance of MIS-$k$-$k$ with maximum degree $k$ and girth greater
than $k$. First, let $G = (V_G = E_M, E_G)$ be the line graph of $M$. For each vertex $v \in V_M$ with
degree strictly less than $k$, we add $k - \deg(v)$ new vertices to $V_G$. Let $\mathrm{star}(v) \subseteq V_G$ be the union
of the edges of $M$ incident on $v$ and the newly added vertices for $v$. Note that $|\mathrm{star}(v)| = k$ for all
$v \in V_M$. Add edges to $G$ to ensure that every $\mathrm{star}(v)$ induces a $k$-clique. For two vertices $u$ and
$v$ of $M$, $\mathrm{star}(u)$ and $\mathrm{star}(v)$ share exactly one vertex if $u$ and $v$ are adjacent in $M$, and share no
vertex otherwise.

Let $S$ be an independent set of $M$. The $|S|$ stars $\{\mathrm{star}(v)\}_{v \in S}$ are pairwise disjoint and each
induces a $k$-clique, so $G$ contains at least $|S|$ disjoint copies of $H$.
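The construction of $G$ from $M$ can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_star_graph`, the encoding of a vertex of $G$ as a `frozenset` of two endpoints, and the `("pad", v, i)` dummy vertices are assumptions made for the example, not notation from the paper.

```python
from itertools import combinations

def build_star_graph(vertices, edges, k):
    """Padded line graph of M: returns (stars, adjacency of G).

    Assumes M is simple with maximum degree at most k.
    """
    stars = {v: set() for v in vertices}
    for u, v in edges:
        e = frozenset((u, v))          # a vertex of G is an edge of M
        stars[u].add(e)
        stars[v].add(e)
    for v in vertices:
        # pad low-degree vertices so that |star(v)| = k
        for i in range(k - len(stars[v])):
            stars[v].add(("pad", v, i))
    adj = {x: set() for s in stars.values() for x in s}
    for s in stars.values():
        # every star induces a k-clique
        for x, y in combinations(s, 2):
            adj[x].add(y)
            adj[y].add(x)
    return stars, adj
```

On a 5-cycle with $k = 3$, for instance, each star has three vertices, stars of adjacent vertices meet in exactly one vertex of $G$, and stars of non-adjacent vertices are disjoint.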

We claim that any $k$-subset of $V_G$ that induces a 2-connected subgraph must be $\mathrm{star}(v)$ for some
$v$. Assume towards contradiction that $T$ is a $k$-subset inducing a 2-connected subgraph of $G$ that
is not contained in a single star. We first show $T$ must contain two disjoint edges of $M$. Take
any $(u, v) \in T$. Since $T \not\subseteq \mathrm{star}(u)$, $T$ contains an edge of $M$ not incident on $u$. If it is not incident
on $v$ either, we are done. Otherwise, let $(w, v)$ be this edge. The same argument from $T \not\subseteq \mathrm{star}(v)$
gives another edge $(w', u)$ in $T$. If $w \neq w'$, $(w, v)$ and $(w', u)$ are disjoint. Otherwise, $w, u, v$ form a
triangle in $M$, contradicting the large girth. Let $(u, v), (w, x)$ be two disjoint edges of $M$ contained
in $T$.

Since the subgraph of $G$ induced by $T$ is 2-connected, there are two internally vertex-disjoint
paths $P_1, P_2$ in $G$ from $(u, v)$ to $(w, x)$. The sum of the two lengths is at most $k$, where
the length of a path is defined to be the number of edges. By considering the internal vertices of $P_i$
(edges of $M$) and deleting unnecessary portions, we have two edge-disjoint paths $P'_1, P'_2$ in $M$ where
each $P'_i$ connects $\{u, v\}$ and $\{w, x\}$, with length at most the length of $P_i$ minus one. There is a
cycle in $M$ consisting only of the edges of $P'_1, P'_2$ together with $(u, v), (w, x)$. Since $|P'_1| + |P'_2| + 2 \le k$,
this contradicts that $M$ has girth strictly greater than $k$. □

We prove that MIS-$k$-$g$ is also hard to approximate by a reduction from MIS-$d$ ($d = \tilde{\Omega}(k)$), using
a slightly different random matching idea. Given a degree-$d$ graph with possibly small girth, we
replace each vertex by a cloud of $B$ vertices, and replace each edge by $a$ copies of a random matching
between the two clouds. While maintaining the soundness guarantee, we show that there are only
a few small cycles, and by deleting a vertex from each of them and sparsifying the graph we obtain
a hard instance for MIS-$k$-$g$. Note that $g$ does not affect the inapproximability factor but only the
runtime of the reduction.

Theorem 7 (Restatement of Theorem |^. For any constants k and g, unless NP C BPP, no 
polynomial time algorithm approximates MIS-k-g within a factor of fc )- 

Proof. We reduce from MIS-$d$ to MIS-$k$-$g$ where $k = O(d \log^2 d)$. Given an instance $G_0 = (V_{G_0}, E_{G_0})$
of MIS-$d$, we construct $G = (V_G, E_G)$ and $G' = (V_{G'}, E_{G'})$ by the following procedure:

• $V_G = V_{G_0} \times [B]$. As usual, let $\mathrm{cloud}(v) = \{v\} \times [B]$.

• For each edge $(u, v) \in E_{G_0}$, repeat the following $a$ times to add a random matching.

  – Take a random permutation $\pi : [B] \to [B]$.

  – Add an edge $((u, i), (v, \pi(i)))$ for all $i \in [B]$.

• Call the resulting graph $G$. To get the final graph $G'$,

  – For any cycle of length at most $g$, delete an arbitrary vertex from the cycle. Repeat
    until there is no cycle of length at most $g$.

Note that the step of eliminating the small cycles can be implemented trivially in time $O(n^{g})$. Let
$n = |V_{G_0}|$, $m = |E_{G_0}|$, $N = nB = |V_G| \ge |V_{G'}|$, $M = m \cdot aB = |E_G| \ge |E_{G'}|$. The maximum degree
of $G$ and $G'$ is at most $ad$. By construction, the girth of $G'$ is at least $g + 1$.
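The construction of $G$ and $G'$ can be sketched as follows. This is a minimal illustration under stated assumptions: the names `random_lift`, `find_short_cycle` and `remove_short_cycles` are hypothetical, parallel edges created by repeated matchings are collapsed (adjacency is stored as sets), and the cycle search is a naive depth-bounded search meant only for small examples.

```python
import random

def random_lift(n, edges, B, a, rng):
    """Replace each vertex by a cloud of B copies and each edge by a random matchings."""
    adj = {(v, i): set() for v in range(n) for i in range(B)}
    for u, v in edges:
        for _ in range(a):
            pi = list(range(B))
            rng.shuffle(pi)                     # random permutation of [B]
            for i in range(B):
                adj[(u, i)].add((v, pi[i]))
                adj[(v, pi[i])].add((u, i))
    return adj

def find_short_cycle(adj, g):
    """Return the vertices of some simple cycle of length <= g, or None."""
    def dfs(path, on_path):
        start, last = path[0], path[-1]
        for w in adj[last]:
            if w == start and len(path) >= 3:
                return list(path)               # closing edge gives a cycle
            if w not in on_path and len(path) < g:
                path.append(w)
                on_path.add(w)
                found = dfs(path, on_path)
                if found:
                    return found
                path.pop()
                on_path.remove(w)
        return None
    for s in adj:
        found = dfs([s], {s})
        if found:
            return found
    return None

def remove_short_cycles(adj, g):
    """Delete one vertex per short cycle until the girth exceeds g."""
    adj = {v: set(ns) for v, ns in adj.items()}
    while True:
        cycle = find_short_cycle(adj, g)
        if cycle is None:
            return adj
        v = cycle[0]
        for w in adj.pop(v):
            adj[w].discard(v)
```

The analysis in the text is what guarantees that only a small fraction of vertices is deleted; the sketch itself makes no such promise for arbitrary parameters.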

Girth Control. We calculate the expected number of small cycles in $G$, and argue that the number
of these cycles is much smaller than the total number of vertices, so that $|V_G|$ and $|V_{G'}|$ are almost
the same. Let $k'$ be the length of a purported cycle. Choose $k'$ vertices $(v^1, l^1), \dots, (v^{k'}, l^{k'})$ which
satisfy

• $v^1 \in V_{G_0}$ can be any vertex.

• For each $1 \le i < k'$, $(v^i, v^{i+1}) \in E_{G_0}$.

• $l^1, \dots, l^{k'} \in [B]$ can be arbitrary labels.

There are $n$ possible choices for $v^1$, $B$ choices for each $l^i$, and $d$ choices for each $v^i$ ($i > 1$). The
number of possibilities to choose such $(v^1, l^1), \dots, (v^{k'}, l^{k'})$ is bounded by $n d^{k'-1} B^{k'}$. Without loss
of generality, assume that no vertex appears more than once.

For each edge $e = (u, w) \in G_0$, consider the intersection of the purported cycle $((v^1, l^1), \dots, (v^{k'}, l^{k'}))$
and the subgraph induced by $\mathrm{cloud}(u) \cup \mathrm{cloud}(w)$. It is a bipartite graph with maximum degree
2. Suppose there are $q$ purported edges $e^1, \dots, e^q$ (ordered arbitrarily) in this bipartite graph. By
slightly abusing notation, let $e^i$ also denote the event that $e^i$ exists in $G$. The following claim upper
bounds $\Pr[e^i \mid e^1, \dots, e^{i-1}]$ for each $e^i$.

Claim 2. $\Pr[e^i \mid e^1, \dots, e^{i-1}] \le \frac{a}{B - k'}$.

Proof. There are $a$ random matchings between $\mathrm{cloud}(u)$ and $\mathrm{cloud}(w)$, and for each $j < i$, there
is at least one random matching including $e^j$. We fix one random matching and calculate the
probability that the random matching contains $e^i$, conditioned on the fact that it already contains
some of $e^1, \dots, e^{i-1}$.

If there is $e^j$ ($j < i$) that shares a vertex with $e^i$, $e^i$ cannot be covered by the same random
matching as $e^j$. If a random matching covers $p$ of $e^1, \dots, e^{i-1}$ which are disjoint from $e^i$, the
probability that $e^i$ is covered by that random matching is $\frac{1}{B - p}$, and this is maximized when $p = i - 1$.

By a union bound over the $a$ random matchings, $\Pr[e^i \mid e^1, \dots, e^{i-1}] \le \frac{a}{B - k'}$. □

The probability that all of $e^1, \dots, e^q$ exist is at most $\left(\frac{a}{B - k'}\right)^q$.
Since edges of $G_0$ are processed independently, the probability of success for one fixed purported
cycle is at most $\left(\frac{a}{B - k'}\right)^{k'}$. The expected number of cycles of length $k'$ is at most

\[
n d^{k'-1} B^{k'} \left(\frac{a}{B - k'}\right)^{k'}
\le n d^{k'-1} a^{k'} \exp\left(\frac{k'^2}{B - k'}\right)
\le e\, n (ad)^{k'}
\]

by taking $B - k' \ge k'^2$. Summing over $k' = 1, \dots, g$, the expected number of cycles of length up to

g, is bounded by eg{ad)^n. Take B > • eg{ady. Then with probability at least 3/4, the number 

of cycles of length at most g is at most ^. By taking 1/d^ fraction of vertices away (one for each 

short cycle), we have a girth at least g + I, which implies IVgI < \ Vg'\ < Ihbl- 
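The first-moment calculation in the girth-control step can be checked numerically. The sketch below assumes the bound has the form $n d^{k'-1} B^{k'} (a/(B-k'))^{k'} \le e\,n(ad)^{k'}$ whenever $B - k' \ge k'^2$, which is how we read the derivation above; the function names and the sample parameters are illustrative only.

```python
import math

def expected_cycles_bound(n, d, a, B, kp):
    """First-moment bound n * d^(k'-1) * B^k' * (a / (B - k'))^k'."""
    return n * d ** (kp - 1) * B ** kp * (a / (B - kp)) ** kp

def simple_bound(n, d, a, kp):
    """The cleaner upper bound e * n * (a d)^k'."""
    return math.e * n * (a * d) ** kp

# With B - k' = k'^2 the first bound never exceeds the second.
for kp in range(3, 8):
    B = kp + kp * kp
    assert expected_cycles_bound(10, 3, 2, B, kp) <= simple_bound(10, 3, 2, kp)
```

The inequality holds because $(1 + k'/(B - k'))^{k'} \le \exp(k'^2/(B - k')) \le e$ under the stated choice of $B$.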

Hardness of MIS-d states that it is NP-hard to distinguish the case Go has an independent set 
of measure c := and the case where the maximum independent set has measure at most 

»:= 0 (! 4 ^), 

Completeness. Let $I_0$ be an independent set of $G_0$ of measure $c$. Then $I = I_0 \times [B]$ is also an
independent set of $G$ of measure $c$. Let $I' = I \cap V_{G'}$. $I'$ is independent in both $G$ and $G'$, and the
measure of I' in G' is at least the measure of I' in G, which is at least c — 1/d^ = 

Soundness. Suppose that every subset of $V_{G_0}$ of measure at least $s$ contains an edge. Say a graph
is $(\beta, \alpha)$-dense if, whenever we take a $\beta$ fraction of vertices, at least an $\alpha$ fraction of edges lie within the induced
subgraph. We also say a bipartite graph is $(\beta, \alpha)$-bipartite dense if, whenever we take a $\beta$ fraction of vertices
from each side, at least an $\alpha$ fraction of edges lie within the induced subgraph.

Claim 3. For a = b = the following holds with probability at least 3/4.- 

For every $(u, w) \in E_{G_0}$, the bipartite graph between $\mathrm{cloud}(u)$ and $\mathrm{cloud}(w)$ is $(\epsilon, \epsilon^2/8)$-bipartite
dense for all $\epsilon \ge s$.


Proof. Fix $(u, w)$ and $\epsilon \in [s, 1]$, and let $X \subseteq \mathrm{cloud}(u)$ and $Y \subseteq \mathrm{cloud}(w)$ be such that $|X| = |Y| = \epsilon B$.
The number of possibilities of choosing $X$ and $Y$ is

\[
\binom{B}{\epsilon B}^2 \le \exp(O(\epsilon \log(1/\epsilon) B)).
\]

Without loss of generality, let $X = Y = [\epsilon B]$. In one random matching, let $X_i$ ($i \in [\epsilon B]$) be
the random variable indicating whether vertex $(u, i) \in X$ is matched with a vertex in $Y$ or not.
$\Pr[X_i = 1] = \epsilon$, and $\Pr[X_i = 1 \mid X_1, \dots, X_{i-1}] \ge \epsilon/2$ for $i \in [\epsilon B/2]$ and any $X_1, \dots, X_{i-1}$. Therefore,
the expected number of edges between $X$ and $Y$ is at least $\epsilon^2 B/4$. With $a$ random matchings, the
expected number is at least $a\epsilon^2 B/4$. By Chernoff bound, the probability that it is less than $a\epsilon^2 B/8$
is at most exp(^|^). By union bound over all possibilities of choosing X and Y, the probability 
that the bipartite graph is not (e, e^/8)-bipartite dense is 

exp(elog(lA)B).exp(-—j< — 

by taking a = ) and B = . A union bound over all possible choices of e (B 

possibilities) and m edges of Eq implies the claim. □ 

Claim 4. With the parameters a and B above, G is {Aslog{l/s),id{^))-dense. 
Proof. Fix a subset $S$ of measure $4s\log(1/s)$. For a vertex $v$ of $G_0$, let $\mu(v) := |S \cap \mathrm{cloud}(v)|/B$, so
that $\mathbb{E}_v[\mu(v)] = 4s\log(1/s)$. Partition $V_{G_0}$ into $t + 1$ buckets $B_0, \dots, B_t$ ($t := \lceil \log_2(1/s) \rceil$), such
that $B_0$ contains $v$ such that $\mu(v) \le s$, and for $i \ge 1$, $B_i$ contains $v$ such that $\mu(v) \in (2^{i-1}s, 2^i s]$.
Denote 


^x{Bi) := 


I^Gol 


Clearly h{Bq) < s. Pick z G {1,..., t} with the largest fJ.{Bi). We have fi{Bi) > 2s since E^[/r(u)] > 
4slog(l/s). Let 7 = 2*“^s. All vertices of Bi has fi{v) G [ 7 , 27 ], so \Bi\ > {s/^)n. 

Since Go has no independent set with more than ns vertices, Turan’s Theorem says that the 
subgraph of Go induced by Bi has at least “ 1) = edges. This is at least D(^) 

fraction of the total number of edges. 

For each of these edges, by Claim at least 7^/8 fraction of the edges from the bipartite graph 

connecting the clouds of its two endpoints, lie in the subgraph induced by S (since 7 > s). Overall, 

2 

we conclude that there are at least D(^) • ^ = fraction of edges inside the subgraph induced 
by S. □ 


Sparsification. Recall that G' is obtained from G by deleting at most ^ fraction of vertices to 
have girth greater than g. In the completeness case, G' has an independent set of measure at least 
c— 1/d^ = In the soundness case, G is (4slog(l/s), D(|))-dense, so G' is (/3,a)-dense 

where (3 := o; := ^ ). Using density of G', we sparsify G' again — keep each edge of 

G' by probability so that the expected total number of edges is kn. 

Fix a subset S C Vc of measure (3. Since there are at least a fraction of edges in the subgraph 
induced by S, the expected number of picked edges in this subgraph is at least akn. By Chernoff 
bound, the probability that it is less than is at most exp(—By union bound over all sets 
of measure exactly f3 (there are at most (J^) < exp(2/31og(l//3)n) of them), and over all possible 
values of /3 (there are at most n possible sizes), the desired property fails with probability at most 

n- rnax ^{exp(—aA:n/32) • exp(2/31og(l//3)n)} < n • e”"" 

when k = = 0{dlog‘^ d). In the last step we remove all the vertices of degree more 

than lOfc. Since the expected degree of each vertex is at most 2k, the expected fraction of deleted 
vertices is exp(—D(/c)) <C /3. 

Combining all these results, we have a graph with small degree 10/c = 0{dlog^ d) and girth 

1 

strictly greater than g, where it is NP-hard to approximate MIS within a factor of ^ = ^( j^gS ^ ) = 
^( log^fc )' Therefore, it is NP-hard to approximate MlS-k-g within a factor of D( ^^^y^ ). □ 
A Approximation Algorithm for k-Star Transversal

In this section, we show that $k$-Star Transversal admits an $O(\log k)$-approximation algorithm,
matching the $\Omega(\log k)$-hardness obtained via a simple reduction from Minimum Dominating Set on
degree-$(k - 1)$ graphs [IT], and proving Theorem]^ Let $G = (V_G, E_G)$ be the instance of $k$-Star
Transversal. This problem has a natural interpretation: it is equivalent to finding the smallest
$F \subseteq V_G$ such that the subgraph induced by $V_G \setminus F$ has maximum degree at most $k - 2$. Our
algorithm consists of two phases.

1. Iteratively solve 2 rounds of the Sherali-Adams linear programming (LP) hierarchy and put
vertices with a large fractional value in the transversal. When this phase terminates with a partial
transversal $F$, the remaining subgraph induced by $V_G \setminus F$ has small degree (at most $2k$) and
the LP solution to the last iteration is highly fractional.

2. We reduce the remaining problem to Constrained Set Multicover and use the standard greedy
algorithm. While the analysis of the greedy algorithm for Constrained Set Multicover is used
as a black box, the low degree of the remaining graph and the high fractionality of the LP solution
imply that the analysis is almost tight for our problem as well.
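The degree characterization of $k$-Star Transversal stated above (a vertex set is a transversal exactly when the remaining induced subgraph has maximum degree at most $k - 2$) is easy to test on small graphs. The following brute-force sketch is for illustration only; the function names are hypothetical and the exhaustive search is exponential, so it is not the approximation algorithm of this section.

```python
from itertools import combinations

def is_k_star_transversal(adj, F, k):
    """True if removing F leaves maximum degree at most k - 2."""
    rest = set(adj) - set(F)
    return all(len(adj[v] & rest) <= k - 2 for v in rest)

def smallest_k_star_transversal(adj, k):
    """Exhaustive search for a minimum k-Star Transversal (tiny graphs only)."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for F in combinations(vertices, size):
            if is_k_star_transversal(adj, F, k):
                return set(F)

# A star with three leaves contains a 4-star; deleting any one vertex destroys it.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(len(smallest_k_star_transversal(star, 4)))
```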


Iterative Sherali-Adams. Given $G$, 2 rounds of the Sherali-Adams hierarchy of LP relaxation has
variables $\{x_v\}_{v \in V_G} \cup \{x_{u,v}\}_{u,v \in V_G}$. An integral solution $y : V_G \to \{0, 1\}$, where $y(v) = 1$ indicates
that $v$ is picked in the transversal, naturally gives a feasible solution to the hierarchy by $x_v = y_v$,
$x_{u,v} = y_u y_v$. Consider the following relaxation for $k$-Star Transversal.

\[
\begin{array}{lll}
\text{minimize} & \sum_{v \in V_G} x_v & \\
\text{subject to} & x_{u,v} \ge 0 & \forall u, v \in V_G \\
& x_{u,v} \le x_u & \forall u, v \in V_G \\
& x_u + x_v - x_{u,v} \le 1 & \forall u, v \in V_G \\
& \sum_{v : (u,v) \in E_G} (x_v - x_{u,v}) \ge (\deg(u) - k + 2)(1 - x_u) & \forall u \in V_G
\end{array}
\]


The first three constraints are common to any 2 rounds of the Sherali-Adams hierarchy, and ensure
that for any $u, v \in V_G$, the local distribution on four assignments $\alpha : \{u, v\} \to \{0, 1\}$ forms a valid
distribution. In other words, the following four numbers are nonnegative and sum to 1:
$\Pr[\alpha(u) = \alpha(v) = 1] := x_{u,v}$, $\Pr[\alpha(u) = 0, \alpha(v) = 1] := x_v - x_{u,v}$, $\Pr[\alpha(u) = 1, \alpha(v) = 0] := x_u - x_{u,v}$,
$\Pr[\alpha(u) = \alpha(v) = 0] := 1 - x_u - x_v + x_{u,v}$.

The last constraint is specific to $k$-Star Transversal, and it is easy to see that it is a valid
relaxation: given a feasible integral solution $y : V_G \to \{0, 1\}$, the last constraint is vacuously
satisfied when $y_u = x_u = 1$, and if not, it requires that at least $\deg(u) - k + 2$ neighbors of $u$ should be
picked in the transversal so that there is no copy of $k$-Star in the induced subgraph centered on $u$.
The first phase proceeds as follows.

• Let $S \leftarrow \emptyset$.

• Repeat the following until the size of $S$ does not increase in one iteration.

  – Solve the above Sherali-Adams hierarchy for $V_G \setminus S$, that is, solve the above
    LP with additional constraints $x_v = 1$ for all $v \in S$, which also implies $x_{u,v} = x_u$ for
    $v \in S, u \in V_G$. Denote this LP by SA($S$).

  – $S \leftarrow \{v : x_v \ge \frac{1}{\alpha}\}$, where $\alpha := 10$.

We need to establish three properties from the first phase:

• The size of $S$ is close to that of the optimal $k$-Star Transversal.

• The maximum degree of the subgraph induced by $V_G \setminus S$ is small.

• The remaining solution has small fractional values: $x_v < \frac{1}{\alpha}$ for all $v \in V_G \setminus S$.

The final property is satisfied by the procedure. The following two lemmas establish the other
two properties.

Lemma 5. Let Frac be the optimal value of SA($\emptyset$). When the above procedure terminates, $|S| \le \alpha\,\mathrm{Frac}$.

Proof. Assume that the above loop iterated $l$ times, and for $i = 0, \dots, l$, let $S_i$ be $S$ after the $i$th loop
such that $S_0 = \emptyset, \dots, S_l = S$. We use induction from the last iteration. Let $\mathrm{Frac}_i$ be the optimal
fractional value of SA($S_i$) minus $|S_i|$, so that $\mathrm{Frac} = \mathrm{Frac}_0$.





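We first establish $|S_l| - |S_{l-1}| \le \alpha\,\mathrm{Frac}_{l-1}$. This is easy to see because, when $x$ is the optimal
fractional solution to SA($S_{l-1}$),

\[
|S_l| - |S_{l-1}| = \left|\left\{ v \notin S_{l-1} : x_v \ge \tfrac{1}{\alpha} \right\}\right| \le \alpha\,\mathrm{Frac}_{l-1}.
\]

For $i = l - 2, \dots, 0$, we show that $|S_l| - |S_i| \le \alpha\,\mathrm{Frac}_i$. Let $x$ be the optimal fractional solution
to SA($S_i$), and $x'$ be the solution obtained by partially rounding $x$ in the following way.

• $x'_v = 1$ if $v \in S_{i+1}$. Otherwise, $x'_v = x_v$.

• $x'_{u,v} = x'_u$ (if $v \in S_{i+1}$), $x'_v$ (if $u \in S_{i+1}$), or $x_{u,v}$ otherwise.

It is easy to check that it is a feasible solution to SA($S_{i+1}$) (intuitively, rounding up only helps
feasibility), so its value is

\[
|S_{i+1}| + \sum_{v \notin S_i,\ x_v < 1/\alpha} x_v \ge |S_{i+1}| + \mathrm{Frac}_{i+1},
\]

which implies

\[
\mathrm{Frac}_i = \sum_{v \notin S_i} x_v
= \sum_{v \notin S_i,\ x_v \ge 1/\alpha} x_v + \sum_{v \notin S_i,\ x_v < 1/\alpha} x_v
\ge \frac{1}{\alpha}\left(|S_{i+1}| - |S_i|\right) + \mathrm{Frac}_{i+1}.
\]

Finally, we have

\[
|S_l| - |S_i| \le \alpha\,\mathrm{Frac}_{i+1} + \left(|S_{i+1}| - |S_i|\right) \le \alpha\,\mathrm{Frac}_i,
\]

where the first inequality follows from the induction hypothesis. This completes the induction. □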

Lemma 6. After the termination, every vertex has degree at most $2k$ in the subgraph induced by
$V_G \setminus S$.

Proof. We prove that at least one vertex is added to $S$ if the subgraph induced by $V_G \setminus S$ has a
vertex of degree more than $2k$. Fix one such iteration, and let $S_1$ and $S_2$ be $S$ before and after the
iteration respectively. Let $G'$ be the subgraph of $G$ induced by $V_G \setminus S_1$. If the subgraph induced
by $V_G \setminus S_2$ does not have any vertex with degree more than $2k$, we are done. Otherwise, fix one
such vertex $u \in V_G \setminus S_2$. Note that the degree of $u$ in $G'$ is also more than $2k$.

We show that at least one neighbor $v$ of $u$ satisfies $v \notin S_1$ but $v \in S_2$. Let $x$ be the optimal
fractional solution to SA($S_1$) and consider the following constraint for $u$.

\[
\sum_{v : (u,v) \in E_G} (x_v - x_{u,v}) \ge (\deg(u) - k + 2)(1 - x_u).
\]

Let $\mathrm{Nbr}(u)$ and $\mathrm{Nbr}'(u)$ be the set of neighbors of $u$ in $G$ and $G'$ respectively, and $\deg'(u) = |\mathrm{Nbr}'(u)|$.
Note that $\mathrm{Nbr}'(u) = \mathrm{Nbr}(u) \setminus S_1$, and for $v \in \mathrm{Nbr}(u) \cap S_1$, $x_v = 1$ and $x_{u,v} = x_u$. Therefore, the
above constraint is equivalent to

\[
\sum_{v \in \mathrm{Nbr}(u) \cap S_1} (1 - x_u) + \sum_{v \in \mathrm{Nbr}'(u)} (x_v - x_{u,v}) \ge (\deg(u) - k + 2)(1 - x_u)
\]
\[
\Leftrightarrow \sum_{v \in \mathrm{Nbr}'(u)} (x_v - x_{u,v}) \ge (\deg'(u) - k + 2)(1 - x_u).
\]

The fact that $u \notin S_2$ implies that $x_u < \frac{1}{\alpha}$, which implies

\[
\sum_{v \in \mathrm{Nbr}'(u)} x_v \ge \sum_{v \in \mathrm{Nbr}'(u)} (x_v - x_{u,v})
\ge \left(1 - \frac{1}{\alpha}\right)(\deg'(u) - k)
= \left(1 - \frac{1}{\alpha}\right)\deg'(u)\left(1 - \frac{k}{\deg'(u)}\right).
\]

Therefore, there is one $v \in \mathrm{Nbr}'(u)$ with $x_v \ge (1 - \frac{1}{\alpha})(1 - \frac{k}{\deg'(u)}) > \frac{9}{10} \cdot \frac{1}{2} \ge \frac{1}{\alpha}$, so $v$ satisfies $v \notin S_1$
but $v \in S_2$. □

Constrained Set Multicover. The first phase returns a set $S$ whose size is at most $\alpha$ times
the optimal solution, and the subgraph induced by $V_G \setminus S$ has maximum degree at most $2k$. As
above, let $G'$ be the subgraph induced by $V_G \setminus S$, let $\mathrm{Nbr}(u), \mathrm{Nbr}'(u)$ be the neighbors of $u$ in $G$ and
$G'$ respectively, and let $\deg(u) = |\mathrm{Nbr}(u)|$, $\deg'(u) = |\mathrm{Nbr}'(u)|$. The remaining task is to find a small
subset $F \subseteq V_G \setminus S$ such that the subgraph of $G'$ (and $G$) induced by $V_G \setminus (S \cup F)$ has no vertex of
degree at least $k - 1$. We reduce the remaining problem to the Constrained Set Multicover problem
defined below.

Definition 1. Given a set system $U = \{e_1, \dots, e_n\}$, a collection of subsets $\mathcal{C} = \{C_1, \dots, C_m\}$, and a
positive integer $r_e$ for each $e \in U$, the Constrained Set Multicover problem asks to find the smallest
subcollection (each set must be used at most once) such that each element $e$ is covered at least
$r_e$ times.

Probably the most natural greedy algorithm does the following:

• Pick a set $C$ with the largest cardinality (ties broken arbitrarily).

• Set $r_e \leftarrow r_e - 1$ for $e \in C$. If $r_e = 0$, remove it from $U$. For each $C \in \mathcal{C}$, let $C \leftarrow C \cap U$.

• Repeat while $U$ is nonempty.
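The greedy procedure above can be sketched as follows. This is a minimal illustration, assuming sets are given as a dictionary from a set name to its elements and requirements as a dictionary from element to $r_e$; the function name `greedy_multicover` and the error raised on infeasible instances are choices made for the example.

```python
def greedy_multicover(sets, req):
    """Greedy for Constrained Set Multicover; each set is used at most once."""
    need = {e: r for e, r in req.items() if r > 0}      # live elements and residual r_e
    remaining = {name: set(s) & set(need) for name, s in sets.items()}
    chosen = []
    while need:
        # pick the unused set covering the most live elements
        name = max(remaining, key=lambda c: len(remaining[c]), default=None)
        if name is None or not remaining[name]:
            raise ValueError("instance is infeasible")
        for e in remaining.pop(name):
            need[e] -= 1
            if need[e] == 0:
                del need[e]                             # element fully covered
        for s in remaining.values():
            s.intersection_update(need)                 # restrict sets to live elements
        chosen.append(name)
    return chosen

sets = {"A": {1, 2, 3}, "B": {1, 2}, "C": {3}}
print(greedy_multicover(sets, {1: 2, 2: 1, 3: 1}))
```

In the small instance shown, element 1 must be covered twice, so the greedy picks the largest set first and then the only other set that still contains element 1.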

Constrained Set Cover has the following standard LP relaxation, and Rajagopalan and Vazirani
[1^ showed that the greedy algorithm gives an integral solution whose value is at most $H_d$
(i.e. the $d$th harmonic number) times the optimal solution to the LP, where $d$ is the maximum set
size.

\[
\begin{array}{lll}
\text{minimize} & \sum_{C \in \mathcal{C}} z_C & \\
\text{subject to} & \sum_{C : e \in C} z_C \ge r_e & \forall e \in U \\
& 0 \le z_C \le 1 & \forall C \in \mathcal{C}
\end{array}
\]
Our remaining problem, $k$-Star Transversal on $G'$, can be thought of as an instance of Constrained
Set Cover in the following way: $U := \{u \in V_G \setminus S : \deg'(u) \ge k - 1\}$ with $r_u := \deg'(u) - k + 2$, and
for each $v \in V_G \setminus S$, add $\mathrm{Nbr}'(v) \cap U$ to $\mathcal{C}$. Intuitively, this formulation requires that at least $r_u$ neighbors
of $u$ be picked in the transversal whether $u$ is picked or not. This is not a valid reduction because
the optimal solution of the above formulation can be much more than the optimal solution of our
problem. However, at least one direction is clear (any feasible solution to the above formulation is
feasible for our problem), and it suffices to show that the above LP admits a solution whose value
is close to the optimum of our problem. The LP relaxation of the above special case of Constrained
Set Cover is the following:

\[
\begin{array}{lll}
\text{minimize} & \sum_{v \in V_G \setminus S} z_v & \\
\text{subject to} & \sum_{v : v \in \mathrm{Nbr}'(u)} z_v \ge \deg'(u) - k + 2 & u \in U \\
& 0 \le z_v \le 1 & v \in V_G \setminus S
\end{array}
\]
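The translation of the remaining instance into Constrained Set Multicover can be sketched as follows. The function names are hypothetical, and `adj` is assumed to be the adjacency map of $G'$ (vertex to set of neighbors). The example also illustrates the remark that this is only a one-directional reduction: on $K_4$ with $k = 3$, two deleted vertices already leave maximum degree $k - 2 = 1$, yet they do not form a feasible multicover.

```python
def multicover_instance(adj, k):
    """U = vertices of degree >= k-1, r_u = deg'(u) - k + 2, one set Nbr'(v) ∩ U per v."""
    U = {u for u in adj if len(adj[u]) >= k - 1}
    req = {u: len(adj[u]) - k + 2 for u in U}
    sets = {v: adj[v] & U for v in adj}
    return sets, req

def covers(sets, req, chosen):
    """True if every element e is contained in at least r_e chosen sets."""
    return all(sum(e in sets[c] for c in chosen) >= r for e, r in req.items())

def max_degree_after(adj, F):
    """Maximum degree of the subgraph induced by the vertices outside F."""
    rest = set(adj) - set(F)
    return max((len(adj[v] & rest) for v in rest), default=0)

k4 = {v: {w for w in range(4) if w != v} for v in range(4)}
sets, req = multicover_instance(k4, 3)
print(covers(sets, req, {0, 1, 2}), max_degree_after(k4, {0, 1, 2}))
print(covers(sets, req, {0, 1}), max_degree_after(k4, {0, 1}))
```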

Consider the last iteration of the first phase where we solved SA($S$). Let $x$ be the optimal
solution to SA($S$) and $\mathrm{Frac} := \sum_{v \in V_G} x_v - |S|$. Note that $x_v < \frac{1}{\alpha}$ when $v \notin S$. Define $\{y_v\}_{v \in V_G \setminus S}$ such
that $y_v := 2x_v$.

Lemma 7. $\{y_v\}$ is a feasible solution to the above LP for Constrained Set Cover.

Proof. By construction $0 \le y_v < \frac{2}{\alpha} \le 1$, so it suffices to check for each $u \in U$,

\[
\sum_{v : v \in \mathrm{Nbr}'(u)} y_v \ge \deg'(u) - k + 2.
\]

Fix $u \in U$. Recall that Sherali-Adams constraints on $x$ imply that

\[
\sum_{v \in \mathrm{Nbr}'(u)} (x_v - x_{u,v}) \ge (\deg'(u) - k + 2)(1 - x_u)
\]
\[
\Rightarrow \sum_{v \in \mathrm{Nbr}'(u)} x_v \ge (\deg'(u) - k + 2)(1 - x_u)
\]
\[
\Rightarrow \sum_{v \in \mathrm{Nbr}'(u)} 2x_v \ge \deg'(u) - k + 2,
\]

where the last line follows from the fact that $1 - x_u \ge 1 - \frac{1}{\alpha} \ge \frac{1}{2}$. □

Therefore, the Constrained Set Cover LP admits a feasible solution of value $2\,\mathrm{Frac}$, and the greedy
algorithm gives a $k$-Star Transversal $F$ with $|F| \le 2 \cdot \mathrm{Frac} \cdot H_{2k}$. Since Frac is at most the size of
the optimal $k$-Star Transversal for $G'$ (and clearly $G$), $|S \cup F|$ is at most $O(\log k)$ times the size of
the smallest $k$-Star Transversal of $G$.


B Hardness for Longer Cycles and Connection to FVS 

We introduce some notation convenient for cycles. Given integers $k$ and $i$, let $(i)$ denote the
integer in $[k]$ such that $i \equiv (i) \bmod k$; the choice of $k$ will be clear from the context. Recall that we
use superscripts $v^i$ and $e^i$ to indicate a vertex and an edge of a graph, respectively. In some cases
in this section, a vertex is represented as a vector (e.g. in the $n$-dimensional hypercube, $V = \{0, 1\}^n$
and each vertex $v = (v_1, \dots, v_n)$ is an $n$-dimensional vector). A subscript $v_i$ is used to denote the $i$th
coordinate of $v$ in this case.

B.1 Proof of Theorem 4

We prove Theorem which improves Theorem in the sense that in the completeness case, a 
small subset $F \subseteq V_G$ intersects not only cycles of length exactly $k$, but also all cycles of length
$3, 4, \dots, O(\frac{\log n}{\log \log n})$. The reduction and the soundness analysis are exactly the same. We show the
following lemma for the completeness case which is again almost identical to Lemma[^ but carefully
keeps track of parameters to consider cycles of increasing length.

Lemma 8. Suppose $P$ has a vertex cover $C$ of measure $c$. For any $\epsilon > 0$, with probability at least
3/4, there exists a subset $F \subseteq V_G$ of measure at most $c + \epsilon$ such that the induced subgraph $V_G \setminus F$
has no cycle of length $O(\frac{\log n}{\log \log n})$. The constant hidden in $O$ depends on $k$, $\epsilon$ and the degree $d$ of $P$.

Proof. Let $F = C \times [B]$. We consider the expected number of cycles that avoid $F$ and argue that a
small fraction of additional vertices intersect all of these cycles. Let $k'$ be the length of a purported
cycle. Choose $k'$ vertices $(v^1, l^1), \dots, (v^{k'}, l^{k'})$ which satisfy

• $v^1 \in V_P$ can be any vertex.

• $l^1, \dots, l^{k'} \in [B]$ can be arbitrary labels.

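• For each $1 \le i \le k'$, there must be a hyperedge $e = (u^1, \dots, u^k)$ and $j \in [k]$ such that ($v^i = u^j$
and $v^{(i+1)} = u^{(j+1)}$) or ($v^i = u^{(j+1)}$ and $v^{(i+1)} = u^j$). Equivalently, there are edges between
$\mathrm{cloud}(v^i)$ and $\mathrm{cloud}(v^{(i+1)})$.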

There are $n$ possible choices for $v^1$, $B$ choices for each $l^i$, and $2d$ choices for each $v^i$ ($i > 1$) (there are
at most $d$ hyperedges containing one vertex, and for each canonical cycle, there are two possibilities
to choose a neighbor). The number of possibilities to choose such $(v^1, l^1), \dots, (v^{k'}, l^{k'})$ is bounded
by $n(2d)^{k'-1}B^{k'}$. Note that no other $k'$-tuple of vertices can form a cycle. Further discard the
tuple when two vertices are the same (the resulting cycle is not simple and its simple pieces will be
considered for smaller $k'$).

We calculate the probability that $((v^1, l^1), \dots, (v^{k'}, l^{k'}))$ forms a cycle (i.e. all $k'$ edges exist)
that does not intersect $F$. For a set of purported edges, we say that this set can be covered
by a single canonical cycle if one copy of a canonical cycle can contain all edges of the set with nonzero
probability. Suppose that all $k'$ edges in the purported cycle can be covered by a single canonical
cycle. This is only possible when $k' = k$ and there is a hyperedge $e$ such that after an appropriate
shifting, $e = (v^1, \dots, v^k)$ (recall that $e$ is considered to be an ordered $k$-tuple). In this case,
$((v^1, l^1), \dots, (v^k, l^k))$ intersects $F$ (right case of Figure [^). When the $k'$ edges of the purported cycle
have to be covered by more than one canonical cycle, some vertices must be covered by more than
one canonical cycle, and each canonical cycle covering the same vertex should give the same label
to that vertex. This redundancy makes it unlikely to have all $k'$ edges exist at the same time (left
case of Figure]^). The claim below, similar to an earlier claim but designed for cycles to obtain better
parameters, formalizes this intuition.
Claim 5. Suppose that $((v^1, l^1), \dots, (v^{k'}, l^{k'}))$ cannot be covered by a single canonical cycle. Then
the probability that it forms a cycle is at most $k'\left(\frac{adk'}{B}\right)^{k'}$.


Proof. Fix $2 \le p \le k'$. Partition the $k'$ purported edges into $p$ nonempty groups $I_1, \dots, I_p$ such that
each group can be covered by a single canonical cycle. There are at most $p^{k'}$ possibilities to
partition. For each $v \in V_P$, there are at most $d$ hyperedges containing $v$ and at most $aBd$ canonical
cycles intersecting $\mathrm{cloud}(v)$. Therefore, all edges in one group can be covered simultaneously by at
most $aBd$ copies of canonical cycles. There are at most $(aBd)^p$ possibilities to assign a canonical
cycle to each group. Assume that one canonical cycle is responsible for exactly one group. This is
without loss of generality since if one canonical cycle is responsible for many groups, we can merge
them and this case can be dealt with using a smaller $p$.

Focus on one group $I$ of purported edges, and one canonical cycle $L$ which is supposed to
cover them. Let $I' \subseteq V_G$ be the set of vertices which are incident on the edges in $I$. Suppose
$L = ((u^1, l'^1), \dots, (u^k, l'^k))$, which is produced by a hyperedge $f = (u^1, \dots, u^k) \in E_P$. We calculate
the probability that $L$ contains all edges in $I$ over the choice of labels $l'^1, \dots, l'^k$ for $L$. One necessary
condition is that

\[
\{ v \mid (v, l) \in I' \text{ for some } l \in [B] \}
\]

(i.e., the set $I'$ projected to $V_P$) is contained in $f$. Otherwise, some vertices of $I'$ cannot be
covered by $L$. Another necessary condition is $v^i \ne v^j$ for any $(v^i, l^i) \ne (v^j, l^j) \in I'$. Otherwise
($(v, l^i), (v, l^j) \in I'$ for $l^i \ne l^j$), since $L$ gives only one label to each vertex in $f \subseteq V_P$, $(v, l^i)$ and
$(v, l^j)$ cannot be contained in $L$ simultaneously. Therefore, we have a nice characterization of $I'$:
it consists of at most one vertex from the cloud of each vertex in $f$.

Now we make a crucial observation that $|I'| \ge |I| + 1$. This is because $I$ is a proper subset of the
edges that form a simple cycle. Formally, in the graph with vertices $I'$ and edges $I$, the maximum
degree is at most 2, and there are at least two vertices of degree 1. The probability that $L$ contains
$I$ is at most the probability that for each $(v^i, l^i) \in I'$, $l^i$ is equal to the label $L$ assigns to $v^i$, which
is $B^{-|I'|} \le B^{-|I|-1}$.

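We conclude that for each partition, the probability of having all the edges is at most

\[
\prod_{q=1}^{p} B^{-|I_q| - 1} = B^{-k' - p}.
\]

The probability that $((v^1, l^1), \dots, (v^{k'}, l^{k'}))$ forms a cycle is therefore bounded by

\[
\sum_{p=2}^{k'} p^{k'} (aBd)^p B^{-k'-p}
= \sum_{p=2}^{k'} p^{k'} \frac{(ad)^p}{B^{k'}}
\le k' \left(\frac{adk'}{B}\right)^{k'}.
\]

□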


Therefore, the expected number of cycles of length k' that avoid F is bounded by n{2d)^'~^B^' • 
k'{^^)k < n{Rk')^ where i? is a constant depending only on a and d (both are independent of k'). 
With probability at least 3/4, the number of such cycles of length up to k' is at most A.n{Rk')^'^^ . 
Let B > ^ Then these cycles can be covered by at most cnB = cN vertices. If k' = 

then k'^' = exp(/c'log/c') is also o(n), we can take B linear in n and k' > lo'g io^at )■ 
B.2 Hardness of k-HVC with Bounded Degree and Density

In this subsection, we observe the implicit properties of the best known hardness result for
$k$-HVC [21], such as bounded degree and density. Only bounded degree is needed for our main
theorem as well as its extension Theorem 4 for cycles proved in the previous subsection, while another
derandomized proof of hardness in the next subsection requires the density property as well. The
following is the theorem explicitly stated in [24].

Theorem 8 ([21])- [Restatement of Theorem^ Given a k-uniform hypergraph (k >3) and e > 0, 
it is NP-hard to tell apart the following cases: 

• Completeness: There exists a vertex cover of measure c := 

• Soundness: Every vertex cover has measure at least s := 1 — e. 

Therefore, it is NP-hard to approximate k-HVC within a factor k — 1 + 2e. 

In some cases, we need the fact that a given hypergraph $P$ has small degree (a function of $k$
and $\epsilon$ only) as well as the following additional density property in the soundness case. In a hypergraph,
we define the degree of a vertex to be the number of hyperedges containing it.

• The maximum degree of $P$ is bounded by $d$.

• In the soundness case above, every set of measure at least $\delta > 0$ contains a $\rho > 0$ fraction of
hyperedges in the induced subgraph.

For example, the NP-hardness of A:-HVC with c=|,s = l — |,d = 2^^, 5 = |, and p = 
for some /3 is made explicit in m- However, careful examination of other results, especially that 
of |21j, yields a better result. 

Theorem 9 ([21]). For any rational $\epsilon > 0$, Theorem 8 holds with $d = O(1)$, $\delta > 0$, $\rho > 0$ being some
constants depending on $k$ and $\epsilon$.

Proof. Theorem 4.1 of [21] requires a multi-layered PCP with parameters $l$ (number of layers) and
$R$ (number of labels), which both depend on $k$ and $\epsilon$. Note that in the original Raz verifier, the
degree $d_R$ is a function of $R$. Given a Raz verifier which consists of a bipartite graph $G = (V_G, E_G)$
such that $V_G = Y \cup Z$, Theorem 3.3 yields a multilayered PCP where variables of layer $i$ are
of the form $(z_1, \dots, z_i, y_{i+1}, \dots, y_l)$ where $z_j \in Z$ and $y_j \in Y$. The number of labels for any
vertex is bounded by $R^l$. For $i < j$, there exists a constraint between $(z_1, \dots, z_i, y_{i+1}, \dots, y_l)$ and
$(z'_1, \dots, z'_j, y'_{j+1}, \dots, y'_l)$ if and only if

• $z_q = z'_q$ where $q \le i$.

• $y_q = y'_q$ where $q > j$.

• $(y_q, z'_q) \in E_G$ for $i < q \le j$.

Therefore, the degree is at most $l(d_R)^l$, which is still a function of $k$ and $\epsilon$. After the reduction
from a multilayered PCP to a weighted hypergraph, the degree of each vertex is still bounded by
a function of k and e, since each variable of the PCP is replaced by at most 2^* vertices and each
PCP constraint is replaced by at most 2^^ hyperedges. 

Given such a weighted instance, we convert it to an unweighted instance by duplicating vertices 
according to their weights. The weight of each vertex in the ith layer is of the form 


1 

Wl 


/(I -p) 


Ri-r 


where Xi = is the set of vertices in the ith layer, Ri = is the number of labels in 

ith layer, p = 1 — and 0 < r < The original paper set the weight as above so that the 

sum of weights becomes 1. Multiply weight of each vertex by \Yi\^ so that the weight of each vertex 
in the ith layer is of the form 


1 

1 



p"(l-p)^”-" 


Let a be a rational that divides both p and 1 — p with both quotients bounded. Then 
divides any p^(l — p)^*“^ as well with quotient bounded by a function of e and k. Therefore, if we 
set the minimum weight to be 

1 

I ■ 

the weight of each vertex must be divisible by the minimum weight, and the quotient will be
bounded by a function of $k$ and $\epsilon$. We replace each weighted vertex by (weight / minimum weight)
unweighted vertices, and for each hyperedge $(v^1, \dots, v^k)$, add all hyperedges $(u^1, \dots, u^k)$
where $u^i$ is a copy of $v^i$. Since each quotient and the original degree of the weighted instance are
bounded by a function of $k$ and $\epsilon$, so is the degree of the unweighted instance.
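The vertex-duplication step can be sketched as follows. This is an illustrative sketch only: the function name `unweight`, the use of exact `Fraction` weights, and the encoding of a copy as a pair `(v, i)` are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def unweight(weights, hyperedges, min_weight):
    """Replace each vertex by weight/min_weight copies and lift every hyperedge."""
    copies = {}
    for v, w in weights.items():
        q = Fraction(w) / Fraction(min_weight)
        if q.denominator != 1:
            raise ValueError("weight is not a multiple of the minimum weight")
        copies[v] = [(v, i) for i in range(int(q))]
    # each hyperedge becomes all tuples of copies of its vertices
    new_edges = [t for e in hyperedges
                 for t in product(*(copies[v] for v in e))]
    return copies, new_edges

w = {"a": Fraction(1, 4), "b": Fraction(1, 2), "c": Fraction(1, 4)}
copies, edges = unweight(w, [("a", "b", "c")], Fraction(1, 4))
print(len(copies["b"]), len(edges))
```

Because every quotient is bounded, each original hyperedge is replaced by a bounded number of hyperedges, which is why the degree stays bounded after duplication.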

Now we have an unweighted problem with completeness $c$, soundness $s$, and degree bounded by
$d$. Let $\delta = 2(1 - s)$ and $\rho = \frac{k(1 - s)}{d}$. Suppose in the soundness case, we have a $1 - \delta$ fraction of vertices
that cover more than a $1 - \rho$ fraction of hyperedges. Cover the remaining hyperedges with one vertex each. Since
$|E_P| \le \frac{d}{k}|V_P|$, this process requires less than $\rho \cdot \frac{d}{k} = 1 - s$ fraction of vertices, and we have a
vertex cover of measure less than $1 - \delta + (1 - s) = s$. This contradicts the original soundness, so any
$\delta := 2(1 - s)$ fraction of vertices should contain at least a $\rho := \frac{k(1 - s)}{d}$ fraction of hyperedges, both depending
only on $k$ and $\epsilon$. □


B.3 Labeling Gadget 


We now give another proof of hardness of $k$-Cycle Transversal. It is weaker than Theorem 4 in
the sense that a small subset intersects cycles of length at most $O(\log \log n)$ in the completeness
case while in Theorem 4, we are able to intersect cycles of length $O(\frac{\log n}{\log \log n})$. However, it has the
advantage of being derandomized so that the result assumes only P $\ne$ NP instead of NP $\not\subseteq$ BPP.
It crucially uses the fact that graphs are directed, so we hope that further improvements on this
technique will allow more progress on hardness of FVS, which is hard to approximate only on
directed graphs.


Theorem 10. Fix an integer $k \ge 3$ and $\varepsilon \in (0, 1)$. Given a directed graph $G = (V_G, E_G)$, unless
$\mathrm{NP} \subseteq \mathrm{P}$, there is no polynomial time algorithm that distinguishes between the following two cases.


• Completeness: There exists F C Vg with + e fraction of vertices that intersects every 
cycle of length at most O(loglogn) (hidden constant in O depends on k and e). 
• Soundness: Any subset with more than $\varepsilon$ fraction of vertices has a cycle of length exactly $k$
in the induced subgraph.

Intuition. We call this technique labeling gadget, which explicitly controls the structure of every 
cycle. The idea of labeling gadgets to prove hardness of approximation has been used previously to 
show inapproximability of edge-disjoint paths problem with congestion and directed cut problems [3l 

EOIEI]. 

In this work, the labeling gadget is a directed graph $L = (V_L, E_L)$ with roughly the following
properties: (i) its girth is $k$, and (ii) every subset of vertices of measure at least $\delta$ has at least one
cycle of length $k$.

To highlight the main idea, we introduce a valid reduction to FVS that increases the size
of instances exponentially; the actual proof increases the size polynomially but only works for
cycles of bounded length. Given a hypergraph $P$ and the labeling gadget $L$, $G = (V_G, E_G)$ is
constructed in the following way. $V_G = V_P \times V_L^m$, where $m$ is the number of hyperedges in $P$ (say
the hyperedges are $e^1, e^2, \ldots, e^m$), so that the cloud for each vertex $v \in V_P$ becomes $V_L^m$. Each
copy of $L$ corresponds to one of the $m$ hyperedges. Consider the naive approach introduced earlier
where we added $k$ edges for each hyperedge (multiple edges possible), without duplicating vertices.
Call this graph $P' = (V_P, E_{P'})$. In $G$, we add an edge from $(v, x^1, \ldots, x^m)$ to $(u, y^1, \ldots, y^m)$ if and
only if

• There is an edge $(v, u) \in E_{P'}$ created by a hyperedge $e^i$ for some $i$,

• $x^j = y^j$ for all $j \neq i$, and

• $(x^i, y^i) \in E_L$.

Intuitively, if we want to move from $(u, \ldots)$ to $(v, \ldots)$ where the edge $(u, v) \in E_{P'}$ is created by
a hyperedge $e^i$, then we need to move the $i$th coordinate by an edge of $L$ (other coordinates stay
put). Once we have changed the $i$th coordinate, since $L$ has girth $k$, we have to use an edge formed by
$e^i$ at least $k$ times to move the $i$th coordinate back to its original value.

Suppose $C = ((v^1, \ldots), \ldots, (v^l, \ldots))$ is a cycle in $G$. By the above argument, $(v^1, \ldots, v^l)$ is a
cycle of $P'$, and must use at least $k$ edges formed by a single hyperedge, say $e^i$. This is not quite
enough to argue that this cycle intersects a vertex cover of $P$, as the same edge of $P'$ that is created
by hyperedge $e^i$ may be used multiple times. To fix this problem, we color each edge of $L$ by one
of $k$ colors and associate a different color to each of the $k$ edges formed by a hyperedge. If we ensure the
stronger property in the labeling gadget that every cycle of $L$ must be colorful (which implies that
the girth is at least $k$), then the cycle $C$ uses all $k$ edges formed by a
single hyperedge, so it must intersect any vertex cover of $P$. See Figure 2 for an example.

For soundness, given a subset $F \subseteq V_G$ of measure $\delta$, we find a hyperedge $e = (v^1, \ldots, v^k)$
such that $(\bigcap_i \mathrm{cloud}(v^i)) \cap F$ is large. This follows from averaging arguments and needs a density
guarantee in the soundness case of $k$-HVC. Then we focus on the copy of $L$ associated with $e$, find
a colorful $k$-cycle in $L$, and produce the final cycle by combining two cycles: $(v^1, v^2), \ldots, (v^k, v^1)$
(from $V_P$) and the colorful cycle in $L$.

This is a complete and sound reduction from $k$-HVC to the original FVS problem, except that
it blows up the size of the instance exponentially. To get a polynomial time reduction, we compress
the construction by coalescing different copies of $L$, retaining only a constant number (dependent
on the degree of the original hypergraph) out of the $m$ coordinates. However, as a result we are not
able to control the behavior of long cycles, and we may not intersect all cycles in the completeness
case of Theorem 10. Since we have good control over the structure of cycles using labeling gadgets,
and the only issue is to reduce the size of labels, we hope that more sophisticated variants of this
technique might be able to prove inapproximability of FVS itself.

Figure 2: Example with $k = 3$. Each row corresponds to a vertex of $G$, and each edge of $P$ and $L$
has one of 3 types. The first step uses $e_1$ and the solid edge, while the position in $L_2$ stays the same.
The second step uses $e_2$ and the dotted edge.



Labeling Gadget. A $(k, \delta)$-labeling gadget is a directed graph $L = (V_L, E_L)$ with each edge
colored with a color from $[k]$ that satisfies the following three properties.


1. Its girth is exactly $k$.

2. Every cycle has at least one edge for each color.

3. Every subset of vertices of measure at least $\delta$ has at least one cycle $(x^1, \ldots, x^k)$ such that

• Its length is exactly $k$.

• After an appropriate shifting, the color of $(x^i, x^{i+1})$ is $i$.

Let $V_L = [B]^k$, where $B$ will be determined later depending on $\delta$ and $k$. For each $1 \le i \le k$, and
for each $x_1, \ldots, x_k$ and $y_i > x_i$, $y_{i+1} < x_{i+1}$, we add an edge of color $i$ from

$(x_1, \ldots, x_i, x_{i+1}, \ldots, x_k)$ to $(x_1, \ldots, y_i, y_{i+1}, \ldots, x_k)$,

where the index $i + 1$ is taken cyclically (so $k + 1$ means $1$). Intuitively, edges of color $i$ strictly
increase the $i$th coordinate, strictly decrease the $(i + 1)$th coordinate, and do not change the others.

With this construction, properties 1 and 2 can be shown easily. If a cycle uses an edge of
color $i$, the $(i + 1)$th coordinate was decreased by using this edge, and the cycle should use at least one
edge of color $(i + 1)$ to return, since only edges of color $(i + 1)$ increase that coordinate. The same
argument can be applied to color $(i + 1), (i + 2), \ldots$, until the cycle uses all the colors. The following
lemma shows property 3.
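
For small parameters the construction and property 2 can be checked mechanically. The following is a minimal sketch, not part of the original proof: it uses 0-based colors and coordinates, and the function names are hypothetical. It relies on the observation that every cycle uses every color exactly when deleting all edges of any single color leaves an acyclic graph.

```python
from itertools import product


def labeling_gadget(B, k):
    """Colored edges (src, dst, color) of the gadget on range(B)^k, for k >= 2.

    An edge of color i strictly increases coordinate i, strictly decreases
    coordinate (i + 1) mod k, and leaves the other coordinates unchanged.
    """
    edges = []
    for x in product(range(B), repeat=k):
        for i in range(k):
            j = (i + 1) % k
            for yi in range(x[i] + 1, B):
                for yj in range(x[j]):
                    y = list(x)
                    y[i], y[j] = yi, yj
                    edges.append((x, tuple(y), i))
    return edges


def is_acyclic(vertices, edges):
    """Kahn's algorithm: True iff the directed graph has no cycle."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(vertices)


def every_cycle_is_colorful(B, k):
    """True iff removing any one color class destroys all cycles."""
    vertices = list(product(range(B), repeat=k))
    edges = labeling_gadget(B, k)
    return all(
        is_acyclic(vertices, [(u, v) for u, v, c in edges if c != color])
        for color in range(k)
    )
```

For example, with $B = 2$ and $k = 3$ the gadget has 6 edges forming two vertex-disjoint colorful triangles, one of which is (0,1,0) → (1,0,0) → (0,0,1) → (0,1,0).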

Lemma 9. For $k \in \mathbb{N}$ and $\delta > 0$, there exists an integer $B := B(k, \delta)$ such that any subset $S \subseteq [B]^k$
with measure at least $\delta$ contains a $k$-cycle that has one edge of each color.

Proof. Fix a subset $S \subseteq [B]^k$ of measure at least $\delta$. For each $x \in [B]^k$ and $i \in [k]$, define
$\mathrm{line}(x, i) := \{y \in [B]^k : (y)_j = (x)_j \text{ for all } j \neq i\}$ to be the axis-parallel line containing $x$ and
parallel to the $i$th unit vector $e_i$. Let $\mathrm{surface}_S$ map each directed line to the first point in $S$ that the
line hits. Precisely,

\[
\mathrm{surface}_S(x, i) :=
\begin{cases}
\arg\max_{y \in S \cap \mathrm{line}(x, i)} (y)_i & \text{if } S \cap \mathrm{line}(x, i) \neq \emptyset, \\
\emptyset & \text{if } S \cap \mathrm{line}(x, i) = \emptyset,
\end{cases}
\]

and let $S'$ be the set of all points of the form $\mathrm{surface}_S(x, i)$. There are $k \cdot B^{k-1}$ lines total ($B$ points
for each line), so $|S'| \le k \cdot B^{k-1}$. If $B > k/\delta$, then $|S| \ge \delta B^k > k \cdot B^{k-1} \ge |S'|$, so there is an
element in $S \setminus S'$. Call this point $(x_1, \ldots, x_k)$. For any $i \in [k]$,
$(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_k)$ is also in $S$ for some $y_i > x_i$. Then

$((x_1, x_2, \ldots, x_{k-1}, y_k), (x_1, x_2, \ldots, y_{k-1}, x_k), \ldots, (x_1, y_2, \ldots, x_{k-1}, x_k), (y_1, x_2, \ldots, x_{k-1}, x_k), (x_1, x_2, \ldots, x_{k-1}, y_k))$

is a cycle we wanted. □


Reduction. We show a reduction from $k$-HVC to Directed $k$-Cycle Transversal, proving Theorem 10.
Fix $k$ and let $c$, $s := 1 - \varepsilon$, $d$, $\delta$, $\rho$ be the parameters of the $k$-HVC hardness result with
density established above.


Let $k'$ be the maximum length of cycles that we want to intersect in the completeness case,
which will be determined later. Let $L = (V_L, E_L)$ be a $(k, \rho\delta)$-labeling gadget. We are given a
hypergraph $P = (V_P, E_P)$ with the maximum degree $d$. Since each vertex has a degree at most $d$,
each hyperedge shares a vertex with at most $dk$ other hyperedges. Consider a graph $P' = (V_{P'}, E_{P'})$
where $V_{P'} = E_P$ and there exists an edge between $e$ and $f$ if and only if they intersect. Define the
distance between two hyperedges $e$ and $f$ to be the minimum distance between $e$ and $f$ in $P'$. The
maximum degree of $P'$ is bounded by $dk$, and for each $e \in V_{P'}$, there are at most $(dk)^{k'}$ neighbors
within distance $k'$. Therefore, each hyperedge can be colored with $d' = (dk)^{k'} + 1$ colors so that
two hyperedges within distance $k'$ are assigned different colors. To distinguish it from the coloring
of $L$, we call the former outer coloring and the latter inner coloring. We use letters $u, v$ to denote
the vertices of $V_P$, $x, y$ for $V_L$, and $a, b$ for $(V_L)^{d'}$. Furthermore, since some vertices are indexed by
a vector, we use superscripts to denote different vertices (e.g. $x^1, x^2 \in V_L$) and subscripts to denote
different coordinates of a single vertex (e.g. $x = (x_1, \ldots, x_k)$).
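
One way to obtain such an outer coloring is a greedy pass over the hyperedges. The sketch below is an assumption about how the coloring could be computed, not a procedure stated in the text; the name `outer_coloring` and the list-of-tuples encoding are hypothetical. Greedy never needs more colors than one plus the largest number of hyperedges within the given distance of any hyperedge.

```python
from collections import deque


def outer_coloring(hyperedges, radius):
    """Greedily color hyperedges so that any two hyperedges within distance
    `radius` in the intersection graph receive different colors (0-based)."""
    n = len(hyperedges)
    sets = [set(e) for e in hyperedges]
    # Intersection graph: hyperedges are adjacent iff they share a vertex.
    adj = [[g for g in range(n) if g != f and sets[f] & sets[g]]
           for f in range(n)]
    color = [None] * n
    for f in range(n):
        # Breadth-first search collecting everything within `radius` of f.
        dist = {f: 0}
        queue = deque([f])
        while queue:
            g = queue.popleft()
            if dist[g] == radius:
                continue
            for h in adj[g]:
                if h not in dist:
                    dist[h] = dist[g] + 1
                    queue.append(h)
        used = {color[g] for g in dist if g != f and color[g] is not None}
        # At most n - 1 colors are in use, so some color below n is free.
        color[f] = next(c for c in range(n) if c not in used)
    return color
```

On a path of four hyperedges where consecutive ones share a vertex, radius 1 yields an alternating 2-coloring, while radius 2 forces a third color.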


Our reduction will produce a directed graph $G = (V_G, E_G)$ where $V_G = V_P \times (V_L)^{d'} = V_P \times
([B]^k)^{d'}$. The number of vertices (from $P$ to $G$) is increased by a factor of $|V_L|^{d'} = |V_L|^{(dk)^{k'} + 1}$.
Since $|V_L|$ and $dk$ depend only on $k$ and $\varepsilon$, this quantity is polynomial in $|V_P|$ if $k' = O(\log\log |V_P|)$
(with the hidden constant depending on $k$ and $\varepsilon$). The edges of $G$ are constructed as follows:


• For any $e = (v^1, \ldots, v^k) \in E_P$, let $q \in [d']$ be its (outer) color.

• For any $i \in [k]$,

• For any $x, y \in V_L$ such that $(x, y) \in E_L$ with inner color $i$,

• For any $a \in (V_L)^{d'}$,

• We put an edge from $(v^i, a_{q \to x})$ to $(v^{i+1}, a_{q \to y})$ with outer color $q$ and inner color $i$, where $a_{q \to x}$
means that the $q$th outer coordinate of $a$ (which is an element of $V_L$) is replaced by $x$.

For a vertex $(v, a) \in V_G$, consider $a = (x^1, \ldots, x^{d'})$ as a label which is a $d'$-dimensional vector and
each coordinate $x^i$ corresponds to a vertex of $L$. Following one edge with outer color $q$ changes
only $x^q$ (according to $L$), while leaving the other coordinates unchanged. Based on this fact, it is
easy to prove the following lemmas.
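
The nested quantifiers in the edge construction translate directly into nested loops. The following sketch assumes 0-based outer and inner colors and takes the gadget as an explicit list of colored edges; `build_edges` and its parameters are hypothetical names. It enumerates the $d' - 1$ coordinates other than $q$ so that each edge of $G$ is produced once.

```python
from itertools import product


def build_edges(hyperedges, outer_color, gadget_edges, gadget_vertices, d_prime):
    """Yield the edges of G as pairs ((v, a), (v_next, a_next)).

    hyperedges: list of k-tuples of vertices of P.
    outer_color: outer_color[t] in range(d_prime) is the color of hyperedges[t].
    gadget_edges: list of (x, y, i) with inner color i in range(k).
    gadget_vertices: list of the vertices of L.
    """
    for e, q in zip(hyperedges, outer_color):
        k = len(e)
        for x, y, i in gadget_edges:
            src_v, dst_v = e[i], e[(i + 1) % k]
            # `rest` ranges over the label coordinates other than q.
            for rest in product(gadget_vertices, repeat=d_prime - 1):
                a_x = rest[:q] + (x,) + rest[q:]  # q-th coordinate set to x
                a_y = rest[:q] + (y,) + rest[q:]  # q-th coordinate set to y
                yield (src_v, a_x), (dst_v, a_y)
```

Only the $q$th coordinate of the label differs between the two endpoints, which is the fact used in the lemmas that follow.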

Lemma 10. G has girth at least k. 


Proof. From the above discussion, each edge of $G$ acts like an edge for exactly one copy of $L$ and
acts like a self-loop for the other copies of $L$. If $((u^1, a^1), \ldots, (u^l, a^l))$ is a cycle in $G$, then each
coordinate of $a^1, \ldots, a^l$ that changes along the cycle traces a closed walk in $L$, which contains a cycle
of $L$. Since $L$ has girth $k$, $G$ also has girth at least $k$. □

Definition 2 (Canonical cycles). For any hyperedge $e = (v^1, \ldots, v^k)$ of $P$ with outer color $q$, for
any cycle $x^1, \ldots, x^k$ of $L$ such that $(x^i, x^{i+1})$ is colored $i$ for $i = 1, 2, \ldots, k$, and for any $a \in (V_L)^{d'}$,
$((v^1, a_{q \to x^1}), \ldots, (v^k, a_{q \to x^k}))$ is also a cycle of $G$ of length exactly $k$. Call such cycles canonical.

Lemma 11. Suppose $k \le l \le k'$, and let $((u^1, a^1), \ldots, (u^l, a^l))$ be a cycle. Then, there exists a
hyperedge $e$ such that $e \subseteq \{u^1, \ldots, u^l\}$.

Proof. Let one of the edges of the cycle have outer color $q$. By the properties of $L$ (corresponding
to outer color $q$), for each $i \in [k]$, there must be an edge with outer color $q$ and inner color $i$. Since
the distance between two hyperedges with the same outer color is more than $k'$, every edge with outer
color $q$ must be from the same hyperedge, say $e = (v^1, \ldots, v^k)$.

By property 2 of the labeling gadget corresponding to outer color $q$ (equivalently hyperedge
$e$), for every inner color $j$, $((u^1, a^1), \ldots, (u^l, a^l))$ must use an edge with inner color $j$ and outer color
$q$. Notice that if the edge $((u^i, a^i), (u^{i+1}, a^{i+1}))$ has outer color $q$ and inner color $j$, then $u^i = v^j$ and
$u^{i+1} = v^{j+1}$. Therefore, $e \subseteq \{u^1, \ldots, u^l\}$. □

Completeness. 

Lemma 12. Recall that $k' = O(\log\log |V_G|)$. If $P$ has a vertex cover of measure $c$, $G$ has a $k'$-cycle
transversal of measure $c$.

Proof. Let $C \subseteq V_P$ be such that it has measure $c$ and intersects every hyperedge $e \in E_P$. Let
$F = C \times (V_L)^{d'} \subseteq V_G$. It is clear that $F$ has measure $c$. We argue that $F$ indeed intersects every
cycle of length at most $k'$. For every cycle $((u^1, a^1), \ldots, (u^l, a^l))$ of length $k \le l \le k'$, by Lemma 11
there exists a hyperedge $e = (v^1, \ldots, v^k)$ such that $e \subseteq \{u^1, \ldots, u^l\}$. Since $C$ is a vertex cover for
$P$, there exists $u^i \in C$, so $F \supseteq \{u^i\} \times (V_L)^{d'}$ intersects this cycle. □

Soundness. 

Lemma 13. If every subset of $V_P$ with measure at least $\delta$ contains a $\rho$ fraction of hyperedges in
the induced subgraph, every subset of $V_G$ with measure $2\delta$ contains a canonical cycle.

Proof. Let $I \subseteq V_G$ have measure at least $2\delta$. For $a \in (V_L)^{d'}$, we let $\mathrm{slice}(a) := V_P \times \{a\}$ be the
copy of $V_P$ associated with $a$. Let $A := \{a \in (V_L)^{d'} : \mu_P(\mathrm{slice}(a) \cap I) \ge \delta\}$. An averaging argument
shows that $\mu_{(V_L)^{d'}}(A) \ge \delta$. By the soundness property (with density) of $k$-HVC, for each $a \in A$,
$\mathrm{slice}(a) \cap I \subseteq V_P$ contains at least $\rho$ fraction of hyperedges. Therefore, if we consider the product
space $E_P \times (V_L)^{d'}$, at least $\rho\delta$ fraction of tuples $(e, a)$ satisfy $e \subseteq \mathrm{slice}(a) \cap I$.
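
The first averaging step can be spelled out as follows. This is a routine expansion rather than text from the original; it assumes all measures are uniform, writes $\mu_G$ for the measure on $V_G$ and $\mu$ for the measure on $(V_L)^{d'}$ (both symbols are introduced here only for this display), and uses that every label outside $A$ has a slice of measure less than $\delta$.

```latex
\begin{align*}
2\delta \;\le\; \mu_G(I)
  &= \mathbb{E}_{a}\bigl[\mu_P(\mathrm{slice}(a) \cap I)\bigr] \\
  &\le \mu(A) \cdot 1 + (1 - \mu(A)) \cdot \delta \\
  &\le \mu(A) + \delta ,
\end{align*}
```

Rearranging gives $\mu(A) \ge \delta$. The later averaging steps over $E_P$ and over the remaining label coordinates follow the same pattern.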

By an averaging argument with respect to $E_P$, we can conclude that there exists a hyperedge
$e = (v^1, \ldots, v^k)$ such that $\rho\delta$ fraction of $a = (x^1, \ldots, x^{d'}) \in (V_L)^{d'}$ satisfies $e \subseteq \mathrm{slice}(a) \cap I$. Without
loss of generality, assume that its outer color is 1. Another averaging argument with respect to
$x^2, \ldots, x^{d'}$ shows that there exists $(y^2, \ldots, y^{d'})$ such that $X := \{x \in V_L \mid e \subseteq \mathrm{slice}((x, y^2, \ldots, y^{d'})) \cap I\}$
satisfies $\mu_L(X) \ge \rho\delta$.
Since $L$ is a $(k, \rho\delta)$-labeling gadget, there exists a cycle $(x^1, \ldots, x^k) \subseteq X$ such that $(x^i, x^{i+1})$ is
colored with $i$. Our final cycle of $G$ consists of

$(v^i, x^i, y^2, \ldots, y^{d'})$

for each $i \in [k]$. Note that $(v^i, x^i, y^2, \ldots, y^{d'}) \in I$ for each $i$ since by the definition of $X$, for each
$x^i \in X$, $e \subseteq \mathrm{slice}((x^i, y^2, \ldots, y^{d'})) \cap I$. The edge

$((v^i, x^i, y^2, \ldots, y^{d'}), (v^{i+1}, x^{i+1}, y^2, \ldots, y^{d'}))$

exists by the construction. □